AWS, Google: Data Engineer Interview Questions

AWS

Role of Data Engineers at Amazon

  • Collect, store and manage huge quantities of data.
  • Convert raw data into information that can be used to make decisions.
  • Be at the forefront of data-driven decision-making by working closely with data scientists, product managers, and software engineers.
  • Build and maintain database architectures.
  • Coordinate with product managers, software engineers, and data scientists to work on common projects that involve leveraging datasets.
  • Leverage SQL and programming to build algorithms.
  • Perform data modeling and carry out ETL design in keeping with best practices.

Skills and Qualifications Required to Be a Data Engineer at Amazon

  • 4+ years of experience in Python, SQL and ETL design
  • Proven experience in data modeling and building data pipeline architectures
  • 3+ years of experience in big data analytics, with workflow management engines (Airflow, AWS Step Functions, Google Cloud Composer…)
  • Proven experience in working with cloud analytics platforms or MPP analytics platforms such as AWS Redshift, Google BigQuery, Teradata, or Netezza
  • Proven experience in SQL Performance Tuning
  • Proven experience in designing database pipeline architectures
  • Experience in using big data analytics tools such as Spark, Impala, Hive, Presto
  • Experience in E2E process optimization
  • Experience with anomaly/outlier detection

Interview Questions

  • Algorithms and data structures
  • Metric and visualization solution designs
  • Spark, EMR
  • Reporting tools like Tableau and Excel
  • SQL
  • Data pipeline design
  • DB performance tuning
  • Statistics and modeling

Leadership Principles

  • Customer Obsession: meet customer expectations
  • Ownership: go beyond your job responsibilities and take on a challenging project
  • Insist on the Highest Standards: improve quality of a project and motivate others
  • Think Big: significant professional achievement, make a bold and challenging decision, great impact
  • Bias for Action: take a calculated risk, and take the initiative to correct a problem
  • Earn Trust: speak up in a difficult or uncomfortable environment, gain the trust of your team
  • Dive Deep: complicated problem you’ve had to deal with, utilize in-depth data
  • Have Backbone; Disagree and Commit: something you believe in that nobody else does
  • Deliver Results: push to deliver results even when the team has given up

STAR: answer clearly in the form below

  • Situation
  • Task
  • Action
  • Result

Google

https://www.stratascratch.com/blog/google-data-scientist-position-guide

  • Minimum Qualifications: 2+ years of software development, data engineering, business intelligence, data science, or a related field, with experience in manipulating, processing, and extracting value from datasets. 4+ years of experience in designing, building, and deploying cloud-based solution architectures.
  • Collaborative Experience: skillfully communicating, organizing and analyzing
  • Analytical Experience: experience designing data models and data warehouses and using SQL and NoSQL database management systems.
  • Preferred Qualifications: Master’s degree

Behavioral Questions

  • What is your 5-year professional plan?
  • Describe a time you failed to reach a goal.
  • Describe how you work effectively with others and achieve the desired result.
  • Tell me about a project you’re proud of.

Product Sense & Business Cases

You have to be interested in domain-specific data.

  • What kinds of spam would you expect on YouTube, and how would you deal with them?
  • How would you explain cloud computing to a 6-year-old?
  • How many cans of blue paint were sold in the United States last year?

Data Analysis & Coding

  • Find the number of emails received by each user under each built-in email label. The email labels are ~
  • Find the total AdWords earnings for each business type. Output the business types along with the total earnings.
  • Write code to generate a random normal distribution and plot it.
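
For the last question in this list, here is a minimal sketch of one possible answer. It assumes only the Python standard library is available, so it draws samples with random.gauss and "plots" them as a text histogram; in an interview you would more likely reach for NumPy and matplotlib. The function names, bin width, and seed are illustrative choices, not part of the original question.

```python
import random
from collections import Counter


def sample_normal(n, mean=0.0, stddev=1.0, seed=None):
    """Draw n samples from a normal distribution using the standard library."""
    rng = random.Random(seed)
    return [rng.gauss(mean, stddev) for _ in range(n)]


def text_histogram(samples, bin_width=0.5, max_bar=50):
    """Return histogram lines, one per bin between the lowest and highest observed bin."""
    # Each sample goes into the bin whose index is floor(x / bin_width).
    counts = Counter(int(x // bin_width) for x in samples)
    peak = max(counts.values())
    lines = []
    for b in range(min(counts), max(counts) + 1):
        # Scale every bar so the fullest bin is max_bar characters wide.
        bar = "#" * round(counts.get(b, 0) * max_bar / peak)
        lines.append(f"{b * bin_width:6.2f} | {bar}")
    return lines


samples = sample_normal(10_000, seed=42)  # hypothetical sample size and seed
histogram = text_histogram(samples)
print("\n".join(histogram))
```

The printed bars should form the familiar bell shape centered near 0. Fixing the seed keeps the output reproducible, which makes the result easier to discuss with an interviewer.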

Modeling Techniques

  • Why use feature selection? If two predictors are highly correlated, what is the effect on the coefficients in the logistic regression? What are the confidence intervals of coefficients?
  • For sample size n, the margin of error is 3. How many more samples do we need to reduce the margin of error to 0.3?
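
The margin-of-error question can be worked through with a short sketch. It assumes the usual form of the margin of error, z·σ/√n, so that it scales as 1/√n with everything else held fixed. Under that assumption, shrinking the margin by a factor of 10 (from 3 to 0.3) requires 100 times the samples, that is, 99n additional samples. The function names below are hypothetical, and exact fractions are used to avoid floating-point rounding before taking the ceiling.

```python
import math
from fractions import Fraction


def required_sample_size(n, current_moe, target_moe):
    """Total samples needed for target_moe, assuming MoE scales as 1/sqrt(n)."""
    if n <= 0 or current_moe <= 0 or target_moe <= 0:
        raise ValueError("n and both margins of error must be positive")
    # Convert through str so that 0.3 is treated as exactly 3/10.
    ratio = Fraction(str(current_moe)) / Fraction(str(target_moe))
    return math.ceil(n * ratio ** 2)


def additional_samples(n, current_moe, target_moe):
    """Extra samples to collect on top of the n already available."""
    return max(0, required_sample_size(n, current_moe, target_moe) - n)


# Example with a hypothetical starting sample size of 50.
print(required_sample_size(50, 3, 0.3))  # 100 * 50
print(additional_samples(50, 3, 0.3))    # 99 * 50
```

If the interviewer's setup uses a different estimator or a finite-population correction, the 1/√n scaling no longer holds exactly and the answer changes accordingly.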

SoniaComp

Data Engineer interested in constructing Data-Driven Architecture with Cloud Service (https://www.linkedin.com/in/sonia-comp/)