Basic Technical Questions
Interviewers use easy technical questions to weed out candidates without the right experience. These questions assess your experience level, comfort with specific tools, and the depth of your domain expertise. Basic technical questions include:
Describe a time you had difficulty merging data. How did you solve this issue?
Data cleaning and data processing are key job responsibilities in engineering roles, and unexpected issues inevitably come up. Interviewers ask questions like these to determine:
- How well you adapt
- The depth of your experience
- Your technical problem-solving ability
In your answer, clearly explain the issue, what you proposed, the steps you took to solve the problem, and the outcome.
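As one illustration of the kind of merge problem this question is after, the sketch below joins two small record sets whose keys do not line up cleanly (stray whitespace, inconsistent case, and a duplicate key). The records, field names, and the `normalize_key` and `left_join` helpers are all hypothetical, made up for this example; it is only one possible way to surface such issues rather than hide them.

```python
from collections import Counter


def normalize_key(raw):
    """Trim whitespace and lowercase so ' A17 ' and 'a17' compare equal."""
    return raw.strip().lower()


def left_join(left_rows, right_rows, key):
    """Left-join two lists of dicts on `key` after normalizing it.

    Returns (merged_rows, unmatched_keys, duplicate_keys) so the caller can
    report problems instead of silently dropping or multiplying rows.
    """
    counts = Counter(normalize_key(r[key]) for r in right_rows)
    duplicate_keys = sorted(k for k, n in counts.items() if n > 1)

    # Keep the first right-hand row per key; duplicates are reported, not merged.
    lookup = {}
    for row in right_rows:
        lookup.setdefault(normalize_key(row[key]), row)

    merged, unmatched = [], []
    for row in left_rows:
        k = normalize_key(row[key])
        match = lookup.get(k)
        if match is None:
            unmatched.append(k)
        # Right-hand fields fill in first; left-hand values win on conflicts.
        merged.append({**(match or {}), **row, key: k})
    return merged, unmatched, duplicate_keys


# Hypothetical sample data with messy keys.
orders = [
    {"customer_id": " A17 ", "total": 40},
    {"customer_id": "b02", "total": 15},
    {"customer_id": "C99", "total": 8},
]
customers = [
    {"customer_id": "a17", "name": "Ada"},
    {"customer_id": "B02", "name": "Bo"},
    {"customer_id": "b02", "name": "Bo (duplicate)"},
]

merged, unmatched, duplicates = left_join(orders, customers, "customer_id")
print(merged)
print("unmatched:", unmatched)
print("duplicates:", duplicates)
```

In an interview answer, the unmatched and duplicate lists map naturally onto "the issue" and "the steps you took": you found keys that did not match, traced the cause, and decided how to handle each case.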
What ETL tools do you have experience using? What tools do you prefer?
There are many variations of this type of question. A different version might ask about a specific tool: “Have you had experience with Apache Spark or Amazon Redshift?” If a tool is in the job description, it might come up in a question like this. One tip: Include any training you’ve had, how long you’ve used the technology, and specific tasks you can perform with it.
Tell me about a situation where you dealt with unfamiliar technology.
This question asks: What do you do when there are gaps in your technical expertise? In your response, you might include:
- Education and data engineering boot camps
- Self-guided learning
- Working with specialists and collaborators
How would you design a data warehouse given X criteria?
This example is a fundamental case study question in data engineering, and it requires you to provide a high-level design for a data warehouse based on the criteria you are given. To answer questions like this:
- Start with clarifying questions and state your assumptions
- Provide a hypothesis or high-level overview of your design
- Then describe how your design would work
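To make the "high-level overview" step more concrete, here is a minimal sketch of one common warehouse layout, a star schema with a single fact table and two dimension tables, built in an in-memory SQLite database. The retail scenario and every table and column name are assumptions for illustration; a real design would follow from the criteria the interviewer gives you.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables hold descriptive attributes.
    CREATE TABLE dim_product (
        product_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        category   TEXT NOT NULL
    );
    CREATE TABLE dim_store (
        store_id INTEGER PRIMARY KEY,
        city     TEXT NOT NULL
    );
    -- The fact table holds one row per sale, keyed to the dimensions.
    CREATE TABLE fact_sales (
        sale_id    INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
        store_id   INTEGER NOT NULL REFERENCES dim_store(store_id),
        sale_date  TEXT NOT NULL,
        quantity   INTEGER NOT NULL,
        amount     REAL NOT NULL
    );
    """
)

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "Espresso", "Coffee"), (2, "Latte", "Coffee"), (3, "Scone", "Bakery")],
)
conn.executemany(
    "INSERT INTO dim_store VALUES (?, ?)",
    [(1, "Austin"), (2, "Denver")],
)
conn.executemany(
    "INSERT INTO fact_sales (product_id, store_id, sale_date, quantity, amount) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "2024-01-01", 2, 6.0),
        (2, 1, "2024-01-01", 1, 4.5),
        (3, 2, "2024-01-02", 3, 9.0),
        (1, 2, "2024-01-02", 1, 3.0),
    ],
)

# A typical analytical query: join the fact table to a dimension and aggregate.
revenue_by_category = conn.execute(
    """
    SELECT p.category, SUM(f.amount) AS revenue
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
    """
).fetchall()
print(revenue_by_category)
```

When describing how the design would work, a query like the one at the end is a useful prop: it shows which business question the schema is meant to answer.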
How would you design a data pipeline?
A broad, beginner-level case study question like this tests how you approach a problem. With all case study questions, you should ask clarifying questions like:
- What type of data is processed?
- How will the information be used?
- What are the requirements for the project?
- How much data will be pulled? How frequently?
The answers will give you insight into the type of response the interviewer is looking for. Then you can describe your design process, starting with choosing data sources and data ingestion strategies, before moving on to your data processing and implementation plans.
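The stages named above (ingestion, processing, and loading) can be sketched as a very small pipeline. This is a toy illustration under stated assumptions: the CSV content, the `events` table, and the `extract`, `transform`, and `load` function names are hypothetical, and an in-memory SQLite database stands in for whatever destination the real requirements call for.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; the second row has a bad amount on purpose.
RAW_CSV = """user_id,event,amount
1,purchase,19.99
2,purchase,not_a_number
3,refund,5.00
"""


def extract(text):
    """Ingest: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Process: cast types and set aside rows that fail validation."""
    clean, rejected = [], []
    for row in rows:
        try:
            record = (int(row["user_id"]), row["event"], float(row["amount"]))
        except ValueError:
            rejected.append(row)
        else:
            clean.append(record)
    return clean, rejected


def load(conn, records):
    """Store: write validated records to the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events "
        "(user_id INTEGER, event TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
clean, rejected = transform(extract(RAW_CSV))
load(conn, clean)
loaded = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(f"loaded={loaded} rejected={len(rejected)}")
```

Keeping rejected rows instead of discarding them is a design choice worth mentioning in an answer, since it ties back to the clarifying questions about requirements and how the data will be used.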
What questions do you ask before designing data pipelines?
This question assesses how you gather stakeholder information before starting a project. Some of the most common questions to ask would include:
- What is the use of the data?
- Has the data been validated?
- How often will the information be pulled, and how will it be used?
- Who will manage the pipeline?
How do you gather stakeholder input before beginning a data engineering project?
Understanding what stakeholders need from you is essential in any data engineering job, and a question like this assesses your ability to align your work to stakeholder needs. Describe the processes that you typically utilize in your response; you might include tools like:
- Surveys
- Interviews
- Direct observations
- Social science / statistical observation
- Reviewing existing logs of issues or requests
Ultimately, your answer must convey your ability to understand the user and business needs and how you bring stakeholders in throughout the process.
What is your experience with X skill in Python?
General experience questions like this are jumping-off points for more technical case studies. Typically, the interviewer will tailor questions to the role. However, you should be comfortable with standard Python and supplemental libraries like Matplotlib, Pandas, and NumPy, know what’s available, and understand when it’s appropriate to use each library.
One note: Don’t fake it. If you don’t have much experience, be honest. You can also describe a related skill or talk about your comfort level in quickly picking up new Python skills (with an example).
What experience do you have with cloud technologies?
If cloud technology is in the job description, chances are it will show up in the interview. Some of the most common cloud platforms in data engineer interviews include Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform, and IBM Cloud. Additionally, be prepared to discuss specific tools on each platform; for AWS, examples include AWS Glue, Amazon EMR, and Amazon Athena.
What are some challenges unique to cloud computing?
A broad question like this can quickly assess your experience with cloud technologies in data engineering. Some of the challenges you should be prepared to talk about include:
- Security and compliance
- Cost
- Governance and control
- Performance
What’s the difference between structured and unstructured data?
With a fundamental question like this, be prepared to answer with a quick definition and then provide an example.
You could say: “Structured data consists of clearly defined data types and easily searchable information. An example would be customer purchase information stored in a relational database. Unstructured data, on the other hand, does not have a clearly defined format, and therefore does not fit neatly into the rows and columns of a relational database. An example would be video or image files.”
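If you want a concrete picture behind that definition, the short sketch below contrasts the two. The purchase rows and the fake image bytes are made-up stand-ins: the structured rows can be filtered and summed by column, while the raw bytes expose no fields to query until something extracts metadata or features from them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Structured: every row follows the same schema, so fields can be queried directly.
conn.execute("CREATE TABLE purchases (customer TEXT, item TEXT, price REAL)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [("ada", "keyboard", 49.0), ("bo", "monitor", 180.0), ("ada", "mouse", 19.0)],
)
ada_total = conn.execute(
    "SELECT SUM(price) FROM purchases WHERE customer = ?", ("ada",)
).fetchone()[0]

# Unstructured: a hypothetical stand-in for an image file's contents.
# The bytes carry no column names or types; nothing here can be filtered by field.
image_bytes = bytes([0x89, 0x50, 0x4E, 0x47]) + bytes(60)

# Without specialized processing, only coarse descriptive metadata is available.
image_metadata = {
    "size_bytes": len(image_bytes),
    "starts_with": image_bytes[:4].hex(),
}

print(ada_total)
print(image_metadata)
```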
What are the key features of Hadoop?
Some of the Hadoop features you might talk about in a data engineering interview include:
- Fault tolerance
- Distributed processing
- Scalability
- Reliability
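Distributed processing in Hadoop is commonly associated with the MapReduce programming model. The sketch below imitates the map, shuffle, and reduce phases of a word count inside a single Python process. It does not use Hadoop or any Hadoop API; the input splits and the `map_phase`, `shuffle`, and `reduce_phase` names are hypothetical, chosen only to show the shape of the computation that a cluster would spread across nodes.

```python
from collections import defaultdict

# Hypothetical input splits; on a cluster each would live on a different node.
splits = ["big data big clusters", "data pipelines move data"]


def map_phase(split):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word, 1) for word in split.split()]


def shuffle(mapped_outputs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine the values for each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


# Each split is mapped independently, which is what makes the work parallelizable.
word_counts = reduce_phase(shuffle(map_phase(s) for s in splits))
print(word_counts)
```

Because each split is mapped without reference to the others, the map step can run on many machines at once, and a failed split can be re-run on its own, which is one way to connect distributed processing to fault tolerance in an answer.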