This section focuses on "Basics" of Data Science. These Data Science Multiple Choice Questions (MCQ) should be practiced to improve the skills required for various interviews (campus interview, walk-in interview, company interview), placements, entrance exams and other competitive examinations.
1. Data science is the process of deriving knowledge and insights from a huge and diverse set of data through?
A. organizing data
B. processing data
C. analysing data
D. All of the above
Explanation: Data science is the process of deriving knowledge and insights from a huge and diverse set of data through organizing, processing and analysing the data.
2. The modern conception of data science as an independent discipline is sometimes attributed to?
A. William S.
B. John McCarthy
C. Arthur Samuel
D. Satoshi Nakamoto
Explanation: The modern conception of data science as an independent discipline is sometimes attributed to William S.
3. Which of the following languages is used in data science?
Explanation: R is free software for statistical computing and analysis.
4. Which of the following is false?
A. Subsetting can be used to select and exclude variables and observations
B. Raw data should be processed only one time.
C. Merging concerns combining datasets on the same observations to produce a result with more variables
D. None of the above
Explanation: Statement B is false. Raw data may only need to be processed once, but nothing requires that it be processed only one time.
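Subsetting and merging, as described in options A and C, can be sketched in plain Python. This is only an illustration: the records, the field names, and the helper functions subset and merge are all hypothetical, and real projects would more likely use a data-frame library.

```python
# Hypothetical records: each dict is one observation, each key is one variable.
people = [
    {"id": 1, "name": "Ana", "age": 34},
    {"id": 2, "name": "Ben", "age": 27},
    {"id": 3, "name": "Caro", "age": 45},
]
incomes = [
    {"id": 1, "income": 52000},
    {"id": 2, "income": 41000},
    {"id": 3, "income": 67000},
]


def subset(rows, keep_vars, keep_row):
    """Keep only the chosen variables and the observations that pass keep_row."""
    return [{k: r[k] for k in keep_vars} for r in rows if keep_row(r)]


def merge(left, right, key):
    """Combine rows that describe the same observation (inner join on key)."""
    right_by_key = {r[key]: r for r in right}
    return [{**l, **right_by_key[l[key]]} for l in left if l[key] in right_by_key]


# Subsetting: exclude the "name" variable and the observations aged 30 or less.
over_30 = subset(people, ["id", "age"], lambda r: r["age"] > 30)

# Merging: same observations, more variables in the result.
merged = merge(people, incomes, "id")

print(over_30)
print(merged)
```

After the merge, each observation carries the variables from both inputs (id, name, age, income), which is what option C means by a result with more variables.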
5. What is the work of a Data Architect?
A. utilize large data sets to gather information that meets their company's needs
B. work with businesses to determine the best usage of the information yielded from data
C. build data solutions that are optimized for performance and design applications
D. All of the above
Explanation: Data architects build data solutions that are optimized for performance and design applications.
6. Which of the following are correct skills for a Data Scientist?
A. Probability & Statistics
B. Machine Learning / Deep Learning
C. Data Wrangling
D. All of the above
Explanation: All of the above are correct skills for a Data Scientist.
7. Which of the following are correct components of data science?
A. Data Engineering
B. Advanced Computing
C. Domain expertise
D. All of the above
Explanation: All of the above are correct components of data science.
8. Which of the following is not a part of the data science process?
A. Discovery
B. Model Planning
C. Communication Building
D. Operationalize
Explanation: Communication Building is not a part of the data science process.
9. Which of the following are data sources in data science?
A. Structured
B. Unstructured
C. Both A and B
D. None of the above
Explanation: Data sources can be both structured and unstructured, such as logs, SQL, NoSQL, or text.
10. Which of the following is not an application of data science?
A. Recommendation Systems
B. Image & Speech Recognition
C. Online Price Comparison
D. Privacy Checker
Explanation: Privacy Checker is not an application of data science.
11. Point out the correct statement.
A. Raw data is the original source of data
B. Preprocessed data is the original source of data
C. Raw data is the data obtained after processing steps
D. None of the above
Explanation: Raw data is the original source of data, before any processing steps are applied. Accounting programs are prototypical examples of data processing applications.
12. Which of the following is one of the key data science skills?
A. Statistics
B. Machine Learning
C. Data Visualization
D. All of the above
Explanation: Statistics, machine learning, and data visualization are all key data science skills. Data visualization is the presentation of data in a pictorial or graphical format.
13. Which of the following is a key characteristic of a hacker?
A. Afraid to say they don't know the answer
B. Willing to find answers on their own
C. Not Willing to find answers on their own
D. All of the above
Explanation: A hacker is an expert at programming and solving problems with a computer.
14. Raw data should be processed only one time.
A. True
B. False
C. Can be true or false
D. Can not say
Explanation: False. Raw data may only need to be processed once, but nothing requires that it be processed only one time.
15. Which of the following is a common goal of statistical modelling?
A. Inference
B. Summarizing
C. Subsetting
D. None of the above
Explanation: Inference is the act or process of deriving logical conclusions from premises known or assumed to be true.
16. Causal analysis is commonly applied to census data.
A. True
B. False
C. Can be true or false
D. Can not say
Explanation: Descriptive analysis is commonly applied to census data.
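A descriptive analysis only summarizes the data that was collected. The sketch below uses Python's standard statistics module on made-up values; the variable household_sizes and its numbers are hypothetical and do not come from any real census.

```python
import statistics

# Hypothetical household sizes from a small census-style table (made-up values).
household_sizes = [1, 2, 2, 3, 4, 4, 4, 5]

# Descriptive analysis: summarize what was observed, without inferring beyond it.
summary = {
    "count": len(household_sizes),
    "mean": statistics.mean(household_sizes),
    "median": statistics.median(household_sizes),
    "mode": statistics.mode(household_sizes),
    "min": min(household_sizes),
    "max": max(household_sizes),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Nothing here estimates a cause or generalizes to unseen data, which is the difference between a descriptive analysis and a causal or inferential one.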
17. Which of the following models is usually the gold standard for data analysis?
A. Inferential
B. Descriptive
C. Causal
D. All of the above
Explanation: A causal model is an abstract model that describes the causal mechanisms of a system.
18. Which of the following is a revision control system?
A. Git
B. Numpy
C. Scipy
D. Slidify
Explanation: Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.
19. Which of the following steps is performed by a data scientist after acquiring the data?
A. Data Cleaning
B. Data Integration
C. Data Replication
D. All of the above
Explanation: Data cleansing, data cleaning, or data scrubbing is the process of detecting and correcting (or removing) corrupt or inaccurate records from a record set, table, or database.
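The detect-and-correct-or-remove idea can be sketched as below. The records, the field names, the 0 to 120 age range, and the clean function are all illustrative assumptions; real cleaning rules depend on the dataset.

```python
# Hypothetical raw records; the validity rules below are illustrative assumptions.
raw_records = [
    {"id": 1, "age": "34", "city": " Paris "},
    {"id": 2, "age": "-5", "city": "Rome"},     # inaccurate: negative age
    {"id": 3, "age": "abc", "city": "Oslo"},    # corrupt: non-numeric age
    {"id": 1, "age": "34", "city": " Paris "},  # duplicate of id 1
    {"id": 4, "age": "51", "city": ""},         # incomplete: missing city
    {"id": 5, "age": "29", "city": "Lima"},
]


def clean(records):
    """Correct fixable fields and drop records that cannot be repaired."""
    seen_ids = set()
    cleaned = []
    for rec in records:
        if rec["id"] in seen_ids:
            continue  # remove duplicate records
        try:
            age = int(rec["age"])
        except ValueError:
            continue  # remove records whose age is corrupt
        city = rec["city"].strip()  # correct stray whitespace
        if not 0 <= age <= 120 or not city:
            continue  # remove inaccurate or incomplete records
        seen_ids.add(rec["id"])
        cleaned.append({"id": rec["id"], "age": age, "city": city})
    return cleaned


cleaned = clean(raw_records)
print(cleaned)
```

Only the records with ids 1 and 5 survive: one is corrected (whitespace stripped, age converted to an integer) and the rest are removed.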
20. Which of the following focuses on the discovery of (previously) unknown properties in the data?
A. Data mining
B. Big Data
C. Data wrangling
D. Machine Learning
Explanation: Data mining focuses on the discovery of previously unknown properties in the data. Data munging or data wrangling, by contrast, is loosely the process of manually converting or mapping data from one "raw" form into another format that allows for more convenient consumption of the data with the help of semi-automated tools.
21. Data can be categorized into ______ groups.
Explanation: Data can be categorized into two groups: Structured data and Unstructured data
22. Unstructured data is not organized.
A. TRUE
B. FALSE
C. Can be true or false
D. Can not say
Explanation: True. Unstructured data is not organized, so we must organize it before analysis.
23. A column is a ________ representation of data.
A. horizontal
B. diagonal
C. vertical
D. top
Explanation: A column is a vertical representation of data.
24. A ________ is a structured representation of data.
A. database table
B. functions
C. data preparation
D. data frame
Explanation: A data frame is a structured representation of data.
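As a small illustration of questions 23 and 24, the sketch below builds a data frame with the pandas library (an assumption, since the questions do not name a library). The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical values; the column names are made up for illustration.
df = pd.DataFrame({
    "duration": [30, 45, 60],
    "pulse": [80, 95, 110],
})

print(df)           # the whole table: rows run across, columns run down
print(df["pulse"])  # one column: every observation of a single variable
print(df.loc[0])    # one row: every variable of a single observation
print(df.shape)     # (number of rows, number of columns)
```

Selecting df["pulse"] returns the values stacked vertically, while df.loc[0] returns one horizontal slice of the table.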
25. We write ______ in front of mean to let Python know that we want to call the mean function from the NumPy library.
Explanation: We write np. in front of mean to let Python know that we want to call the mean function from the NumPy library. This assumes NumPy was imported under the alias np.
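A minimal sketch of this, assuming NumPy is installed and imported under the conventional alias np. The list pulse and its values are hypothetical.

```python
import numpy as np

# Hypothetical values.
pulse = [80, 95, 110, 115]

# The "np." prefix tells Python to look up mean inside the NumPy module.
average = np.mean(pulse)
print(average)  # 100.0
```

Without the import line, or with a different alias, np.mean would raise a NameError, because the prefix must match the name the library was imported under.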