Big Data Internship Interview Questions

L1 - Technical

1. Introduce yourself.
2. What is your project?
3. What are your source data types (CSV/RDBMS), and how do you get the data?
4. How big was the client cluster?
5. What was the data size?
6. Was the load daily, weekly or monthly?
7. Why did the client select Hadoop rather than an RDBMS?
8. Which tool did you use for workflow?
9. What is staging in Spark?
10. What is an RDD?
11. What is the intention behind lazy evaluation?
12. What is the intention behind keeping RDDs immutable (unable to be updated)?
13. You have written multiple transformations on your RDD but have not yet fired any action. What will your Spark web UI look like?
14. Suppose you fire an action on an RDD. What exactly happens internally in Spark? (Here I explained that Spark walks backward one step at a time through the lineage graph to work out the required RDDs, computes the first RDD, and then works forward again to the action.)
15. Which are the transformations in Spark?
16. I have given you an RDD. How will you convert it to a paired RDD using its first element as the key? Answer: RDD2 = RDD1.map(lambda x: (x[0], x))
17. What is the difference between Hadoop 2.x and 1.x?
18. What is the HA concept? What if the NameNode fails? What needs to be done, and who handled it in your project?
19. What is the heartbeat concept?
20. I have a 500 MB file on Hadoop 2.x. How many blocks and replicas will there be?
21. I have a file:

    home_id  product  meter
    h1       p1       20
    h1       p2       30
    H2       p2       23

    I want to create partitions with home_id as the key. How will you do it on the local file system without using Spark, Hive or MapReduce? Use a simple programming language like Java or Python. Later, how will you do it in Hive and Spark?
22. I have a sorted 3x3 array:

    1   3   5
    7   8   9
    11  15  18

    Write a program so that if the user passes any element from the terminal, it returns its exact position in the array. (I did it as below.)

    a = [(1, 3, 5), (7, 8, 9), (11, 15, 18)]
    x = int(input())  # user input
    for i in range(3):        # row index, 0-based
        for j in range(3):    # column index, 0-based
            if x == a[i][j]:
                print("location of x: %d %d" % (i, j))

L2 - Technical

1. There is a file:

    Name   id
    Ajay   1
    Ram    2
    Ajay   3
    Ram    4
    Jack   6
    Devid  7

    ID is unique and Name may be repeated. Write a program so that when the user enters the name 'ajay', the program returns the list of IDs [1, 3]. Input Ram: output [2, 4].
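The L2 name lookup question above could be approached roughly as follows. This is only a sketch, not the answer given in the interview: the function names `build_index` and `ids_for` are hypothetical, the file is assumed to be whitespace-separated with a header row, and matching is assumed to be case-insensitive because the question enters 'ajay' while the file stores 'Ajay'.

```python
from collections import defaultdict


def build_index(lines):
    """Map each lower-cased name to the list of IDs seen for it, in file order."""
    index = defaultdict(list)
    for line in lines:
        parts = line.split()
        # Skip the header row and any malformed line (assumption: IDs are integers).
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        name, id_text = parts
        index[name.lower()].append(int(id_text))
    return index


def ids_for(index, name):
    """Return the IDs for a name, or an empty list if the name is unknown."""
    return index.get(name.strip().lower(), [])


# Sample rows copied from the question; in practice these would be read from the file.
sample = ["Name id", "Ajay 1", "Ram 2", "Ajay 3", "Ram 4", "Jack 6", "Devid 7"]
index = build_index(sample)
print(ids_for(index, "ajay"))
print(ids_for(index, "Ram"))
```

Building the dictionary once makes every later lookup a single hash access, which matters if the program answers many names against the same file.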

Big Data Engineer

Interviewed at Persistent Systems

4.2★

Oct 28, 2018
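For the home_id partitioning question in this interview, one possible plain-Python approach is sketched below. It assumes whitespace-separated rows with the header already removed, and it borrows the Hive-style `home_id=<value>` directory naming purely as an illustration; the function name `partition_by_home_id` and the file name `part-00000.txt` are hypothetical.

```python
import os
import tempfile
from collections import defaultdict


def partition_by_home_id(lines, out_dir):
    """Write one directory per home_id, each holding that key's remaining columns."""
    groups = defaultdict(list)
    for line in lines:
        home_id, product, meter = line.split()
        # The key lives in the directory name, so it is dropped from the row itself.
        groups[home_id].append(f"{product} {meter}")

    paths = {}
    for home_id, rows in groups.items():
        part_dir = os.path.join(out_dir, f"home_id={home_id}")
        os.makedirs(part_dir, exist_ok=True)
        path = os.path.join(part_dir, "part-00000.txt")
        with open(path, "w") as fh:
            fh.write("\n".join(rows) + "\n")
        paths[home_id] = path
    return paths


# Data rows copied from the question (header omitted).
rows = ["h1 p1 20", "h1 p2 30", "H2 p2 23"]
out_dir = tempfile.mkdtemp()  # scratch directory so the sketch leaves no files behind in the cwd
paths = partition_by_home_id(rows, out_dir)
for home_id in sorted(paths):
    print(home_id, "->", paths[home_id])
```

Grouping in memory is fine for a small file; for input larger than memory, appending each row directly to its partition file would be the safer variant.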


Spark questions, bucketing and partitioning, Hive-related questions, and the difference between a tuple and a list.
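The tuple versus list question is usually about mutability and hashability. A small Python sketch, with all variable names chosen only for illustration:

```python
# A list is mutable: it can grow and change in place.
readings = [20, 30]
readings.append(23)

# A tuple is immutable: item assignment raises TypeError.
point = (3, 4)
try:
    point[0] = 10
    tuple_mutated = True
except TypeError:
    tuple_mutated = False

# Because a tuple of hashable items is itself hashable, it can be a dict key.
meters = {("h1", "p1"): 20}

# A list is not hashable, so using one as a dict key raises TypeError.
try:
    bad = {["h1", "p1"]: 20}
    list_hashable = True
except TypeError:
    list_hashable = False

print(readings, tuple_mutated, list_hashable)
```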

Big Data Engineer

Interviewed at Infosys

3.6★

Mar 27, 2022


Write a program to check whether two strings are anagrams.
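A minimal sketch of one common way to answer this, comparing character counts. It assumes that case, spaces and punctuation should be ignored, which the question does not specify; `is_anagram` is a hypothetical name.

```python
from collections import Counter


def is_anagram(a, b):
    """Return True if a and b use exactly the same letters and digits."""
    def normalize(s):
        # Assumption: ignore case and anything that is not a letter or digit.
        return [ch for ch in s.lower() if ch.isalnum()]

    return Counter(normalize(a)) == Counter(normalize(b))


print(is_anagram("listen", "silent"))
print(is_anagram("spark", "sparks"))
```

Counting characters runs in linear time; sorting both strings and comparing them is an equally valid answer at O(n log n).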

Big Data

Interviewed at Absolutdata

3.5★

Dec 1, 2017


A lot of conceptual questions about when and why I would use certain technologies. A lot of questions about past experience and why I had done certain things in certain scenarios.

Big Data Developer

Interviewed at StackPros

2.9★

Jul 12, 2018


SQL queries

Big Data Engineer

Interviewed at TravelTriangle

4★

Feb 14, 2020


Architecture of project

Big Data Engineer

Interviewed at TravelTriangle

4★

Feb 14, 2020


Questions about additional things to take care of when moving my code into production and how the design could be improved; test-driven and domain-driven development; Kafka; the difference between a list and a set; GraphQL; graph databases; REST APIs and design considerations, etc.
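For the list versus set part of these questions, a short Python sketch of the usual talking points (ordering, duplicates, membership tests and set algebra). The sample values are made up for illustration.

```python
# A list keeps insertion order and allows duplicates.
events = ["login", "click", "click", "logout"]

# A set keeps only distinct elements and has no positional indexing.
unique_events = set(events)

# Lists support access by position; sets support fast membership tests.
second_event = events[1]
has_click = "click" in unique_events

# Sets also support algebra such as intersection and difference.
allowed = {"login", "logout"}
unexpected = unique_events - allowed

print(len(events), len(unique_events), second_event, has_click, unexpected)
```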

Big Data Engineer

Interviewed at Sentiance

4★

Jun 18, 2020


Basic fundamentals of programming, such as C, Java, OOP, etc.

Big Data Analyst

Interviewed at Synchronoss

3.9★

Oct 1, 2020


A simple algorithm question which I don't remember

Big Data Engineer

Interviewed at BMW Group

4.2★

Jan 25, 2016


What is your experience with big data?

Big Data Engineer

Interviewed at NOMIS SOLUTIONS

3.6★

Feb 14, 2016



1,784 big data internship interview questions shared by candidates
