
Microsoft Analyzing Big Data with Microsoft R - 070-773 Exam Questions
QUESTION NO: 1
You are running a parallel function that uses the following R code segment. (Line numbers are included for reference only.)
01 cp <- 0.01; xval <- 0; maxdepth <- 5
02
03 (form, data = "segmentationDataBig", maxDepth = maxdepth, cp = cp, xval = xval, blocksPerRead = 250)
You need to complete the R code. The solution must support chunking.
Which function should you insert at line 02?
Correct Answer: C
QUESTION NO: 2
Note: This question is part of a series of questions that use the same or similar answer choices. An answer choice may be correct for more than one question in the series. Each question is independent of the other questions in this series. Information and details provided in a question apply only to that question.
You need to evaluate the significance of coefficients that are produced by using a model that was estimated already.
Which function should you use?
Correct Answer: G
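As a general R illustration of the task described above (not a reading of any specific answer choice, since the choices are not reproduced here): for a model that has already been fitted, `summary()` reports standard errors, t-statistics, and p-values for each coefficient without re-estimating it. The sketch below uses base R's `lm()` and the built-in `mtcars` data purely as stand-ins; the exam scenario itself concerns Microsoft R models.

```r
# Fit a model once (stand-in for "a model that was estimated already").
fit <- lm(mpg ~ wt + hp, data = mtcars)

# summary() works on the stored fit; it does not re-estimate the model.
fit_summary <- summary(fit)

# Coefficient table with columns: Estimate, Std. Error, t value, Pr(>|t|).
coef_table <- coef(fit_summary)
print(coef_table)

# Flag coefficients that are significant at the 5% level.
significant <- coef_table[, "Pr(>|t|)"] < 0.05
print(significant)
```

The same pattern (fit once, then inspect the stored object) is the point of the example; the variable names and the 5% threshold are illustrative choices.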
QUESTION NO: 3
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You use dplyrXdf, and you discover that after you exit the session, the output files that were created were deleted.
You need to prevent the files from being deleted.
Solution: You remove all instances of the file.remove method.
Does this meet the goal?
Correct Answer: B
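The scenario describes files that disappear when the session exits, which points to session cleanup rather than to explicit `file.remove` calls in the script. A minimal base R sketch of that behaviour follows. It assumes the output sits in a session-scoped temporary location; `tempfile()` is used only as a stand-in for that, and the directory and file names are hypothetical.

```r
# Stand-in for an output file written to a session-scoped temp location.
tmp_out <- tempfile(fileext = ".csv")
write.csv(head(mtcars), tmp_out, row.names = FALSE)

# The file sits under tempdir(), which R deletes when the session ends,
# whether or not the script ever calls file.remove().
in_temp_dir <- startsWith(normalizePath(tmp_out, winslash = "/"),
                          normalizePath(tempdir(), winslash = "/"))

# To keep the result, copy it to a location outside tempdir().
# "persistent_output" is a hypothetical directory name.
keep_dir <- file.path(getwd(), "persistent_output")
dir.create(keep_dir, showWarnings = FALSE)
kept_out <- file.path(keep_dir, "segment_summary.csv")
copied <- file.copy(tmp_out, kept_out, overwrite = TRUE)
```

In dplyrXdf itself, the package's `persist()` verb appears to be intended for moving output to a permanent location; its exact arguments should be checked against the package documentation.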
QUESTION NO: 4
You are running a large logistic regression for 1,000 feature variables by using the LogisticRegression() function in the MicrosoftML package. All of the predictor variables are numeric.
Currently, you specify the input variables separately by using the following formula.
Outcome ~ Feature000 + Feature001 + Feature002 + ... + Feature999
You discover that it takes 20 minutes to estimate each model.
You need to reduce the amount of time required to estimate each model without losing any information in the predictors.
What should you do?
Correct Answer: C
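Independently of which option is correct, the 1,000-term formula in this question does not have to be typed out. A minimal base R sketch, assuming the columns really are named `Feature000` through `Feature999` as the formula suggests, builds it programmatically. This only constructs the formula; it does not by itself change estimation time.

```r
# Zero-padded feature names: Feature000 ... Feature999.
feature_names <- sprintf("Feature%03d", 0:999)

# Build Outcome ~ Feature000 + Feature001 + ... + Feature999.
model_formula <- reformulate(feature_names, response = "Outcome")

# Sanity checks: 1,000 predictors plus the response variable.
n_predictors <- length(feature_names)
n_variables <- length(all.vars(model_formula))
```

The resulting `model_formula` object can be passed wherever a hand-written formula would be accepted.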




