Microsoft Analyzing Big Data with Microsoft R - 070-773 Exam Questions [2026]

QUESTION NO: 1
You are running a parallel function that uses the following R code segment. (Line numbers are included for reference only.)
01 cp <- 0.01; xval <- 0; maxdepth <- 5
02
03 (form, data = "segmentationDataBig", maxDepth = maxdepth, cp = cp, xval = xval, blocksPerRead = 250)
You need to complete the R code. The solution must support chunking.
Which function should you insert at line 02?

A. rxBTrees
B. rxExec
C. rxDTree
D. rxDForest

Correct Answer: C
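
For illustration only, here is a minimal sketch of what a completed call could look like. It assumes RevoScaleR is installed (it ships with Microsoft R, not CRAN R), so the call is guarded. The kyphosis data frame from rpart and the formula are stand-ins chosen for this sketch; they are not the `form` or "segmentationDataBig" objects from the question. Note that rxDTree spells the cross-validation argument `xVal`.

```r
# Tuning values taken from the question.
cp <- 0.01
xval <- 0
maxdepth <- 5

# RevoScaleR is only available with Microsoft R, so guard the call.
if (requireNamespace("RevoScaleR", quietly = TRUE) &&
    requireNamespace("rpart", quietly = TRUE)) {
  # Stand-in data and formula (assumptions for this sketch only).
  kyphosis <- rpart::kyphosis
  form <- Kyphosis ~ Age + Start + Number

  # rxDTree processes the data in chunks; blocksPerRead controls how many
  # blocks are read per chunk when the source is an .xdf file.
  tree <- RevoScaleR::rxDTree(form, data = kyphosis,
                              maxDepth = maxdepth, cp = cp,
                              xVal = xval, blocksPerRead = 250)
  print(tree)
}
```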

QUESTION NO: 2
Note: This question is part of a series of questions that use the same or similar answer choices. An answer choice may be correct for more than one question in the series. Each question is independent of the other questions in this series. Information and details provided in a question apply only to that question.
You need to evaluate the significance of coefficients that are produced by using a model that was estimated already.
Which function should you use?

A. rxTweedie
B. rxTransform
C. rxLogit
D. rxPredict
E. rxDataStep
F. stepAic
G. rxLinMod
H. summary

Correct Answer: H
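
Coefficient significance for a model that has already been estimated is usually read from the coefficient table that `summary()` returns. The sketch below uses base R's `lm()` and the built-in `mtcars` data as stand-ins (both are assumptions for illustration); `summary()` is applied the same way to a fitted RevoScaleR model object.

```r
# Estimate the model once (stand-in model on built-in data).
fit <- lm(mpg ~ wt + hp, data = mtcars)

# summary() on the fitted object returns a coefficient table with
# estimates, standard errors, t statistics, and p-values.
coef_table <- summary(fit)$coefficients
print(coef_table)

# Pick out the terms whose p-value is below the 5% level.
p_values <- coef_table[, "Pr(>|t|)"]
significant <- names(p_values)[p_values < 0.05]
print(significant)
```

No re-estimation is needed: the fitted object already carries everything `summary()` uses to build the table.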

QUESTION NO: 3
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You use dplyrXdf, and you discover that after you exit the session, the output files that were created were deleted.
You need to prevent the files from being deleted.
Solution: You remove all instances of the file.remove method.
Does this meet the goal?

A. Yes B. No

Correct Answer: B

QUESTION NO: 4
You are running a large logistic regression for 1,000 feature variables by using the rxLogisticRegression() function in the MicrosoftML package. All of the predictor variables are numeric.
Currently, you specify the input variables separately by using the following formula.
Outcome ~ Feature000 + Feature001 + Feature002 + ... + Feature999
You discover that it takes 20 minutes to estimate each model.
You need to reduce the amount of time required to estimate each model without losing any information in the predictors.
What should you do?

A. Use selectFeatures() to select the features that provide the most information about the outcome variable.
B. Use princomp() on the correlation matrix of Features, and then use only the first 100 principal components to reduce the number of input variables.
C. Use concat() to create a single array variable named Features, and then specify a new formula named Outcome ~ Features.
D. Use stepControl() to perform stepwise regression to limit the number of variables that contribute to the model.

Correct Answer: C
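
As a rough sketch of the concat() approach, the code below builds the 1,000 feature names and, if MicrosoftML is installed, combines them into one vector-valued column named Features before fitting. The simulated data frame `train`, its size, and the random outcome are assumptions made up for this illustration; only the column names and the formula come from the question.

```r
# The 1,000 predictor names from the question: Feature000 ... Feature999.
feature_names <- sprintf("Feature%03d", 0:999)

# MicrosoftML is only available with Microsoft R, so guard the call.
if (requireNamespace("MicrosoftML", quietly = TRUE)) {
  # Hypothetical stand-in data: 200 rows of random numeric predictors.
  set.seed(1)
  n <- 200
  train <- as.data.frame(
    matrix(rnorm(n * length(feature_names)), nrow = n,
           dimnames = list(NULL, feature_names))
  )
  train$Outcome <- rbinom(n, 1, 0.5) == 1  # logical label

  # concat() merges all predictors into one array column, so the formula
  # names a single variable and every predictor is still used.
  model <- MicrosoftML::rxLogisticRegression(
    Outcome ~ Features,
    data = train,
    type = "binary",
    mlTransforms = list(
      MicrosoftML::concat(vars = list(Features = feature_names))
    )
  )
}
```

Unlike the feature selection, principal component, and stepwise options, this keeps all 1,000 predictors in the model, which is what "without losing any information" requires.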