Business Development
Q:
A decision maker is considering including two additional variables into a regression model that has as the dependent variable, Total Sales. The first additional variable is the region of the country (North, South, East, or West) in which the company is located. The second variable is the type of business (Manufacturing, Financial, Information Services, or Other). Given this, how many additional variables will be incorporated into the model?
A) 2
B) 6
C) 8
D) 9
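The counting rule behind this question can be sketched in a few lines. This is a minimal illustration, assuming the usual convention that a categorical variable with k levels is coded with k - 1 dummy variables; the function name `dummy_count` is hypothetical.

```python
# Sketch: count the 0/1 dummy variables needed for categorical predictors.
# Assumption: a categorical variable with k levels is coded with k - 1 dummies
# (one level serves as the baseline).
def dummy_count(levels_per_variable):
    """Return the total number of dummy variables for the given level counts."""
    return sum(k - 1 for k in levels_per_variable)

region_levels = 4    # North, South, East, West
business_levels = 4  # Manufacturing, Financial, Information Services, Other

total = dummy_count([region_levels, business_levels])
print(total)
```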
Q:
Golf handicaps are used to allow players of differing abilities to play against one another in a fair match. Recently a sample of golfers was selected in an effort to develop a model for explaining the difference in handicaps. One independent variable of interest is the number of rounds played per year. Another is whether or not the player is using an "original" name brand club or a copy. In recent years, a number of smaller golf club manufacturers have attempted to copy major golf club designs and sell "copies" of original clubs such as the Big Bertha by Callaway. The resulting regression analysis containing both Rounds Played and a Dummy variable for Club Used is shown as follows:
Given this information, which of the following statements is not correct?
A) The overall regression model is insignificant at the alpha = .05 level.
B) The Club Dummy variable is statistically significant in the model meaning that knowing that a player used an original club or copy is of value in knowing the player's handicap.
C) The two independent variables do not explain a statistically significant portion of the variation in golf handicap.
D) All of the above are not correct.
Q:
The following multiple regression output was generated from a study in which two independent variables are included. The first independent variable (X1) is a quantitative variable measured on a continuous scale. The second variable (X2) is qualitative coded 0 if Yes, 1 if No.
Based on this information, which of the following statements is true?
A) The model explains nearly 63 percent of the variation in the dependent variable.
B) If tested at the 0.05 significance level, the overall model would be considered statistically significant.
C) The variable X1 has a slope coefficient that is significantly different from zero if tested at the 0.05 level of significance.
D) All of the above are true.
Q:
Which of the following statements is true?
A) Dummy variables are used to incorporate categorical variables into a regression model.
B) You should use one fewer dummy variables than are categories for the qualitative variable in question.
C) It is appropriate to compute a correlation coefficient for the relationship between a dependent variable and a dummy variable.
D) All of the above are true.
Q:
If a decision maker wishes to develop a regression model in which the University Class Standing is a categorical variable with 5 possible levels of response, then he will need to include how many dummy variables?
A) 5
B) 4
C) 1
D) 3
Q:
In a multiple regression analysis involving 15 independent variables and 200 observations, SST = 800 and SSE = 240. The adjusted coefficient of determination is
A) 0.15
B) 0.50
C) 0.66
D) 0.70
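The arithmetic for this question can be checked with a short sketch. It assumes the standard definition of the adjusted coefficient of determination, 1 - (1 - R²)(n - 1)/(n - k - 1), where k is the number of independent variables; the function name is hypothetical.

```python
# Sketch: adjusted R-squared from SST, SSE, sample size n, and k predictors.
# Assumption: adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1).
def adjusted_r_squared(sst, sse, n, k):
    r_squared = 1 - sse / sst
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

adj = adjusted_r_squared(sst=800, sse=240, n=200, k=15)
print(round(adj, 4))
```

The unadjusted R² here is 1 - 240/800 = 0.70; the adjustment lowers it slightly because 15 predictors use up degrees of freedom.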
Q:
A multiple regression is shown for a data set of yachts where the dependent variable is the price in thousands of dollars.
Given this information, which is correct regarding the test of the overall model using the 0.10 level of significance?
A) The overall model does not have significant ability to predict the price of a yacht because p-value = .163 is greater than 0.10
B) The overall model has significant ability to predict the price of a yacht because p-value = 0.163 is greater than 0.10
C) The overall model does not have significant ability to predict the price of a yacht because p-value = .001 is less than 0.10
D) The overall model has significant ability to predict the price of a yacht because p-value = .001 is less than 0.10
Q:
A multiple regression is shown below for a data set of yachts where the dependent variable is the price of the boat in thousands of dollars.
Given this information, what percentage of variation in the dependent variable is explained by the regression model?
A) Approximately 68 percent
B) About 83 percent
C) About 37 percent
D) About 60 percent
Q:
Which of the following statements is true?
A) If the confidence interval estimate for the regression slope coefficient, based on the sample information, crosses over zero, the true population regression slope coefficient could be zero.
B) R-square will tend to be smaller than the adjusted R-squared values when insignificant independent variables are included in the model.
C) The y-intercept will usually be negative in a multiple regression model when the regression slope coefficients are predominately positive.
D) None of the above
Q:
Which of the following regression output values is used in computing the variance inflation factors?
A) The standard error of the estimate
B) The regression intercept value
C) The F critical value from the F distribution for the appropriate number of degrees of freedom and the appropriate level of significance
D) The R-squared value
Q:
Under what circumstances does the variance inflation factor signal that multicollinearity may be a problem?
A) When the value of VIF exceeds the size of the sample from which the regression model was developed
B) When the VIF value is approximately 1.0
C) When the VIF is greater than or equal to 5
D) When the VIF is a negative value
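The link between R-squared and the variance inflation factor can be sketched as follows. This assumes the standard definition VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing predictor j on the other independent variables; the threshold of 5 is the rule of thumb quoted in the answer choices, and the function name is hypothetical.

```python
# Sketch: variance inflation factor for one predictor.
# Assumption: r_squared_j is the R-squared obtained when predictor j is
# regressed on all the other independent variables.
def vif(r_squared_j):
    if not 0 <= r_squared_j < 1:
        raise ValueError("R-squared must be in [0, 1)")
    return 1.0 / (1.0 - r_squared_j)

# A VIF of 1 means no linear overlap with the other predictors;
# the rule of thumb in this question flags values of 5 or more.
for r2 in (0.0, 0.5, 0.8, 0.9):
    value = vif(r2)
    print(r2, round(value, 2), "possible problem" if round(value, 2) >= 5 else "ok")
```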
Q:
Which of the following is not an indication of potential multicollinearity problems?
A) The sign on the standard error of the estimate is positive.
B) A sign on a regression slope coefficient is negative when the sign on the correlation coefficient was positive.
C) The standard error of the estimate increases when a variable enters the model in the presence of other independent variables.
D) An independent variable goes from being statistically significant to being insignificant when a new variable is added to the model.
Q:
In a multiple regression model, which of the following is true?
A) The coefficient of determination will be equal to the square of the highest correlation in the correlation matrix.
B) Adding variables that have a low correlation with the dependent variable will cause the R-square value to decline.
C) The sum of the residuals computed for the least squares regression equation will be zero.
D) The adjusted R-square might be higher or lower than the value of the R-square.
Q:
Which of the following is not an assumption of the multiple regression model?
A) The mean of the residuals is equal to the variance at all combinations of levels of the independent variables.
B) The regression error terms are normally distributed.
C) The model error terms are independent.
D) The residuals have a constant variance for all combinations of values for the independent variables.
Q:
To check out whether the regression assumption involving normality of the error terms is valid, it is appropriate to construct a normal probability plot. If this plot forms a straight line from the lower left-hand corner diagonally up to the upper right-hand corner, the error terms may be assumed to be normally distributed.
Q:
The following residual plot was constructed based on a simple linear regression model.Based on this plot, there appears to be no basis for concluding that a curvilinear model may be more appropriate than a linear model to explain the variation in the y variable.
Q:
The following residual plot is an output of a regression model.Based on this residual plot, there is evidence to suggest that the underlying relationship between the y variable and the x variable is nonlinear.
Q:
What factors are of importance to an analyst when linear regression analysis is used for descriptive purposes?
Q:
Explain why it is important to construct scatter plots prior to conducting regression analysis.
Q:
Explain what the correlation coefficient measures and some detail of the key issues associated with it. Be sure to also discuss the concept of spurious correlation.
Q:
A random sample of two variables, x and y, produced the following observations:
x    y
19   7
13   9
17   8
9    11
12   9
25   6
20   7
17   8
Test to determine whether the population correlation coefficient is negative. Use a significance level of 0.05 for the hypothesis test.
A) Because t = -4.152 < -1.9432, reject the null hypothesis. Because the null hypothesis is rejected, the sample data does support the hypothesis that there is a negative linear relationship between x and y.
B) Because t = -4.152 < -1.9432, do not reject the null hypothesis. Because the null hypothesis is not rejected, the sample data support the hypothesis that there is a negative linear relationship between x and y.
C) Because t = -9.895 < -1.9432, reject the null hypothesis. Because the null hypothesis is rejected, the sample data does support the hypothesis that there is a negative linear relationship between x and y.
D) Because t = -9.895 < -1.9432, do not reject the null hypothesis. Because the null hypothesis is not rejected, the sample data support the hypothesis that there is a negative linear relationship between x and y.
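The test in this question can be reproduced with a short sketch. It assumes the usual test statistic for a correlation, t = r·sqrt(n - 2)/sqrt(1 - r²) with n - 2 degrees of freedom, and takes the one-tailed critical value -1.9432 from the answer choices rather than computing it; `pearson_r` is a hypothetical helper.

```python
import math

# The eight (x, y) observations from the question.
x = [19, 13, 17, 9, 12, 25, 20, 17]
y = [7, 9, 8, 11, 9, 6, 7, 8]

def pearson_r(xs, ys):
    """Sample correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(x, y)
n = len(x)
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

t_critical = -1.9432  # assumption: taken from the answer choices (alpha = 0.05, df = 6)
reject_null = t < t_critical
print(round(r, 3), round(t, 2), reject_null)
```

Rounding r before computing t shifts the statistic slightly, which may explain small differences from the value printed in the options.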
Q:
The following data for the dependent variable, y, and the independent variable, x, have been collected using simple random sampling:
x    y
10   120
14   130
16   170
12   150
20   200
18   180
16   190
14   150
16   160
18   200
Compute the correlation coefficient.
A) 0.52
B) 0.71
C) 0.62
D) 0.89
Q:
An industry study was recently conducted in which the sample correlation between units sold and marketing expenses was 0.57. The sample size for the study included 15 companies. Based on the sample results, test to determine whether there is a significant positive correlation between these two variables. Use alpha = 0.05.
A) Because t = 2.50 > 1.7709, do not reject the null hypothesis. There is not sufficient evidence to conclude there is a positive linear relationship between sales units and marketing expense for companies in this industry.
B) Because t = 2.50 > 1.7709, reject the null hypothesis. There is sufficient evidence to conclude there is a positive linear relationship between sales units and marketing expense for companies in this industry.
C) Because t = 3.13 > 1.7709, do not reject the null hypothesis. There is not sufficient evidence to conclude there is a positive linear relationship between sales units and marketing expense for companies in this industry.
D) Because t = 3.13 > 1.7709, reject the null hypothesis. There is sufficient evidence to conclude there is a positive linear relationship between sales units and marketing expense for companies in this industry.
Q:
If a residual plot exhibits a curved pattern in the residuals, this means that:
A) the errors are not normally distributed.
B) there must be a curvilinear relation between x and y.
C) there is no significant relation between x and y.
D) there is a problem with constant variance.
Q:
Which of the following is a correct interpretation for the regression slope coefficient?
A) For a one-unit change in y, we can expect the value of the independent variable to change by b1 units on average.
B) For each unit change in x, the dependent variable will change by b1 units.
C) The average change in y of a one-unit change in x will be b1 units.
D) The average change in x of a one-unit change in y will be b1 units.
Q:
Which of the following statements is true?
A) When using a simple linear regression analysis model for prediction purposes, the potential error in the forecast will be less when the value of x used to forecast y is closer to x̄.
B) The accuracy of the regression forecast is improved if the standard error for the regression slope coefficient is reduced.
C) The use of regression analysis as a means of predicting the value for a dependent variable is not impacted by sampling error since the regression model uses all sample data to arrive at the regression model.
D) None of the above
Q:
If you were going to develop a scatter plot for the purpose of determining whether one of the assumptions of the regression model is being satisfied, which of the following is true?
A) The plot should illustrate a bell-shaped distribution to show that the residuals are normally distributed.
B) The horizontal axis should show the fitted values for the dependent variable.
C) The plot should illustrate a cone shaped look.
D) The points should fall in a straight line.
Q:
Residual analysis is conducted to check whether regression assumptions are met. Which of the following is not an assumption made in simple linear regression?
A) Errors are independent of each other.
B) Errors are normally distributed.
C) Errors are linearly related to x.
D) Errors have constant variance.
Q:
In analyzing the residuals to determine whether the simple regression analysis satisfies the regression assumptions, which of the following is true?
A) The histogram of the residuals should be approximately bell-shaped.
B) The scatter plot of the residuals against the dependent variable should illustrate that the variation in residuals is the same over all levels of .
C) Neither A nor B is true.
D) Both A and B are true.
Q:
A study was recently performed by the Internal Revenue Service to determine how much tip income waiters and waitresses should make based on the size of the bill at each table. A random sample of bills and resulting tips were collected. These data are shown as follows:
Total Bill   Tip
$126         $19
$58          $11
$86          $20
$20          $3
$59          $14
$120         $30
$14          $2
$17          $4
$26          $2
$74          $16
Based upon these data, what is the approximate predicted value for tips if the total bill is $100?
A) $15.55
B) $20.61
C) $26.03
D) $12.88
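A least-squares fit of this kind can be sketched in plain Python. The data are the ten (bill, tip) pairs from the question; `fit_line` is a hypothetical helper that uses the textbook formulas b1 = Sxy/Sxx and b0 = ȳ - b1·x̄.

```python
# The ten (total bill, tip) pairs from the question, in dollars.
bills = [126, 58, 86, 20, 59, 120, 14, 17, 26, 74]
tips = [19, 11, 20, 3, 14, 30, 2, 4, 2, 16]

def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1

b0, b1 = fit_line(bills, tips)
predicted_tip = b0 + b1 * 100  # point prediction for a $100 bill
print(round(b0, 3), round(b1, 4), round(predicted_tip, 2))
```

A $100 bill lies inside the sampled range of $14 to $126, so this is an interpolation rather than an extrapolation.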
Q:
Which of the following statements is true?
A) The interval estimate for predicting a particular value of y given a specific x will be narrower than the interval estimate for the average value of y given a particular x.
B) The higher the r-square value, the wider will be the prediction interval based on a simple linear regression model.
C) The prediction interval generated from a simple linear regression model will be narrowest when the value of x used to generate the predicted y value is close to the mean value of x.
D) The prediction interval generated from a simple linear regression model will be widest when the value of x used to generate the predicted y value is close to the mean value of x.
Q:
Assume that you have calculated a prediction of ŷ = 110 where the specific value for x is equal to the average value of x. Also assume that n = 201 and that the standard error of the estimate is sε = 4.5. Find the approximate 95 percent prediction interval.
A) About 101 ----- 119
B) About 109.4 ----- 110.6
C) About 105.5 ----- 104.5
D) About 98.4 ----- 121.6
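The interval can be approximated with a short sketch. It assumes the usual prediction-interval formula for simple linear regression, ŷ ± t·sε·sqrt(1 + 1/n + (x_p - x̄)²/Sxx), in which the last term is zero because x equals its mean; the critical value 1.972 is an assumed t-table value for 199 degrees of freedom, not something given in the question.

```python
import math

y_hat = 110.0   # predicted value at x = x-bar
n = 201         # sample size
s_e = 4.5       # standard error of the estimate
t_crit = 1.972  # assumption: two-tailed 95% t value for df = n - 2 = 199

# At x = x-bar the (x_p - x_bar)^2 / Sxx term vanishes, leaving sqrt(1 + 1/n).
half_width = t_crit * s_e * math.sqrt(1 + 1 / n)
lower, upper = y_hat - half_width, y_hat + half_width
print(round(lower, 1), round(upper, 1))
```

With n this large the 1/n term is almost negligible, so the half-width is close to 2·sε.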
Q:
It is believed that the number of people who attend a Mardi Gras parade each year depends on the temperature that day. A regression has been conducted on a sample of years where the temperature ranged from 28 to 64 degrees and the number of people attending ranged from 8400 to 14,600. The regression equation was found to be ŷ = 2378 + 191x. Which of the following is true?
A) The average change in parade attendance is an additional 2378 people per one-degree increase in temperature.
B) The average change in parade attendance is an additional 191 people per one-degree increase in temperature.
C) If the temperature is 75 degrees, we can expect that 16,703 people will attend.
D) If the temperature is 0 degrees this year, then we should expect 2378 people to attend.
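Options C and D both plug in temperatures outside the sampled range of 28 to 64 degrees. A minimal sketch of guarding against that kind of extrapolation, using the coefficients stated in the question (the function name and the choice to raise an error are illustrative assumptions):

```python
# Sketch: apply the fitted line only inside the range of x that was sampled.
def predict_attendance(temp_f, x_min=28, x_max=64):
    """Predicted attendance from y-hat = 2378 + 191x, refusing to extrapolate."""
    if not x_min <= temp_f <= x_max:
        raise ValueError("temperature is outside the sampled range; "
                         "the fitted line is not supported there")
    return 2378 + 191 * temp_f

print(predict_attendance(50))  # inside the sampled range

try:
    predict_attendance(75)     # outside the sampled range
except ValueError as err:
    print("refused:", err)
```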
Q:
The following regression output was generated based on a sample of utility customers. The dependent variable was the dollar amount of the monthly bill and the independent variable was the size of the house in square feet.
Based on this regression output, what is the 95 percent confidence interval estimate for the population regression slope coefficient?
A) Approximately -0.0003 ----- +0.0103
B) About -0.0082 ----- +0.0188
C) Approximately -32.76 ----- +32.79
D) None of the above
Q:
The National Football League has performed a study in which the total yards gained by teams in games was used as an independent variable to explain the variation in total points scored by teams during games. The points scored ranged from 0 to 57 and the yards gained ranged from 187 to 569. The following regression model was determined:
Given this model, which of the following statements is true?
A) The average points scored for teams who gain zero yards during a game is -12.3 points.
B) The average yards gained will increase by .12 for every additional point scored.
C) The average change in points scored for each increase of one yard will be 0.12
D) The average number of points scored per game is 12.3
Q:
When using regression analysis for descriptive purposes, which of the following is of importance?
A) The size of the regression slope coefficient
B) The sign of the regression slope coefficient
C) The standard error of the regression slope coefficient
D) All of the above
Q:
Given the data below, one ran the simple regression analysis of Y on X.
Y    X
4    2
3    1
4    4
6    3
8    5
The relationship between Y and X is
A) significant at the alpha = 1 percent level.
B) significant at the alpha = 5 percent level.
C) significant at the alpha = 10 percent level.
D) not significant at the alpha = 10 percent level.
Q:
A regression analysis between sales (Y) and advertising (X) (both in dollars) resulted in the following equation:
The above equation implies that an
A) increase of $1 in advertising is correlated with an increase of $2,000 in sales.
B) increase of $1 in advertising is correlated with an increase of $2 in sales.
C) increase of $1 in advertising is correlated with an increase of $100 in sales.
D) increase of $1 in advertising is correlated with an increase of $2100 in sales.
Q:
Which of the following statements is true in simple linear regression?
A) The standard error of the estimate is equal to the standard error of the slope.
B) The total degrees of freedom are (n-2).
C) The coefficient of determination is equal to the correlation of x and y.
D) The p-value of the F test will equal the p-value of the t-test of the slope.
Q:
Consider the following partially completed computer printout for a regression analysis where the dependent variable is the price of a personal computer and the independent variable is the size of the hard drive.
Based on the information provided, which of the following statements is true if alpha = .05?
A) The slope is not significantly different from 0 because p-value = 0.84 is greater than 0.05
B) The slope is significantly different from 0 because p-value = 9.95 is greater than 0.05
C) The slope is not significantly different from 0 because p-value = 9.95 is greater than 0.05
D) The slope is significantly different from 0 because p-value = 9.95E-10 is less than 0.05
Q:
Assume that a regression has been conducted for a group of small companies where x = the number of employees at the company, y = annual revenue of the company (recorded in thousands of dollars), and the largest company included in the study had 82 employees. The resulting regression equation is ŷ = 59.2 + 83.4x. Which of the following is true?
A) For each additional employee, revenue on average will increase by $83.4
B) A company with 2100 employees could be predicted to have average revenue of about $175 million.
C) For each additional employee, revenue on average will increase by $59.2 thousand.
D) This model should not be used to make predictions for companies with more than 82 employees.
Q:
A recent study by a major financial investment company was interested in determining whether the annual percentage change in stock price for companies is linearly related to the annual percent change in profits for the company. The following data was determined for 7 randomly selected companies:
% Change Stock Price   % Change in Profit
8.4                    4.2
9.5                    5.6
13.6                   11.2
-3.2                   4.5
7                      12.2
18.4                   12
-2.1                   -13.4
Based upon this sample information, which of the following is the regression equation?
A) ŷ = 4.19 + .61x
B) ŷ = 15.04 + 4.25x
C) ŷ = 1.19 - 3.00x
D) ŷ = 20.19 + .005x
Q:
In a regression analysis situation, the standard error of the slope is:
A) a measure of the variation in the regression slope from sample to sample.
B) equal to the square root of the standard error of the estimate.
C) a measure of the amount of change in y that will occur for a one-unit change in x.
D) All of the above
Q:
Which of the following statements is true with respect to a simple linear regression model?
A) The regression slope coefficient is the square of the correlation coefficient.
B) The percentage of variation in the dependent variable that is explained by the independent variable can be determined by squaring the correlation coefficient.
C) It is possible that the correlation between a y and x variable might be statistically significant, but the regression slope coefficient could be determined to be zero since they measure different things.
D) The standard error of the estimate is equal to the standard error of the slope.
Q:
The following regression output is available. Notice that some of the values are missing.
Given this information, what is the standard error of the estimate for the regression model?
A) About 36.18
B) Approximately 6.02
C) About 1.98
D) 3.91
Q:
Use the following regression results to answer the question below.
In conducting a hypothesis test of the slope using a 0.05 level of significance, which of the following is correct?
A) The slope differs significantly from 0 because p-value = 0.205 is greater than 0.05
B) The slope does not differ significantly from 0 because p-value = 0.205 is greater than 0.05
C) The slope differs significantly from 0 because p-value = 0.003 is less than 0.05
D) The slope does not differ significantly from 0 because p-value = 0.003 is less than 0.05
Q:
Use the following regression results to answer the question below.
How many observations were involved in this regression?
A) 7
B) 8
C) 9
D) 10
Q:
Use the following regression results to answer the question below.
Which of the following is true?
A) x explains about 88.5 percent of the variation in y.
B) y explains about 88.5 percent of the variation in x.
C) x explains about 78.4 percent of the variation in y.
D) y explains about 78.4 percent of the variation in x.
Q:
Which of the following is NOT an assumption for the simple linear regression model?
A) The individual error terms are statistically independent.
B) The distribution of the error terms will be skewed left or right depending on the shape of the dependent variable.
C) The error terms have equal variances for all values of the independent variable.
D) The mean of the dependent variable value for all levels of x can be connected by a straight line.
Q:
A study was done in which the high daily temperature and the number of traffic accidents within the city were recorded. These sample data are shown as follows:
High Temperature   Traffic Accidents
91                 7
56                 4
75                 9
68                 11
50                 3
39                 5
98                 8
Given this data the sample correlation is:
A) -0.57
B) 0.64
C) 1.54
D) 0.57
Q:
A recent study of 15 shoppers showed that the correlation between the time spent in the store and the dollars spent was 0.235. Using a significance level equal to 0.05, which of the following is true?
A) The null hypothesis that the population mean is equal to zero should be rejected and we should conclude that the true correlation is not equal to zero.
B) Based on the sample data there is not enough evidence to conclude that the true correlation is different from zero.
C) The sample correlation coefficient could be zero since the test statistic does not fall in the rejection region.
D) The null hypothesis should be rejected because the test statistic exceeds the critical value.
Q:
The term that is given when two variables are correlated but there is no apparent connection between them is:
A) spontaneous correlation.
B) random correlation.
C) spurious correlation.
D) linear correlation.
Q:
Assume that a medical research study found a correlation of -0.73 between consumption of vitamin A and the cancer rate of a particular type of cancer. This could be interpreted to mean:
A) the more vitamin A consumed, the lower a person's chances are of getting this type of cancer.
B) the less vitamin A consumed, the lower a person's chances are of getting this type of cancer.
C) the more vitamin A consumed, the higher a person's chances are of getting this type of cancer.
D) vitamin A causes this type of cancer.
Q:
Recently, an automobile insurance company performed a study of a random sample of 15 of its customers to determine if there is a positive relationship between the number of miles driven and the age of the driver. The sample correlation coefficient is r = .38. Given this information, and assuming that the test is to be performed at the .05 level of significance, which of the following is the correct test statistic?
A) t = 1.4812
B) t = 1.7709
C) z = 2.114
D) t = 1.74
Q:
Which of the following statements is correct?
A) A scatter plot showing two variables with a positive linear relationship will have all points on a straight line.
B) The stronger the linear relationship between two variables, the closer the correlation coefficient will be to 1.0.
C) Two variables that are uncorrelated with one another may still be related in a nonlinear manner.
D) All of the above are correct.
Q:
If a sample of n = 30 people is selected and the sample correlation between two variables is r = 0.468, what is the test statistic value for testing whether the true population correlation coefficient is equal to zero?
A) About t = 2.80
B) About t = -3.01
C) t = 2.0484
D) Can't be determined without knowing the level of significance for the test.
Q:
If a pair of variables have a strong curvilinear relationship, which of the following is true?
A) The correlation coefficient will be able to indicate that curvature is present.
B) A scatter plot will not be needed to indicate that curvature is present.
C) The correlation coefficient will not be able to indicate the relationship is curved.
D) The correlation coefficient will be equal to zero.
Q:
If the population correlation between two variables is determined to be -0.70, which of the following is known to be true?
A) There is a positive linear relationship between the two variables.
B) There is a fairly strong negative linear relationship between the two variables.
C) An increase in one of the variables will cause the other variable to decline by 70 percent.
D) The scatter diagram for the two variables will be upward sloping from left to right.
Q:
In analyzing the relationship between two variables, a scatter plot can be used to detect which of the following?
A) A positive linear relationship
B) A curvilinear relationship
C) A negative linear relationship
D) All of the above
Q:
A high coefficient of determination (R²) implies that the regression model will be a good predictor for future values of the dependent variable given the value of the independent variable.
Q:
A regression model that is deemed to have a regression slope coefficient that could be equal to zero should not be used for prediction since there is no established linear relationship between the x and y variable.
Q:
Given the following regression equation, the predicted value for y when x = 0.5 is about 4.57
Q:
A manufacturing company is interested in predicting the number of defects that will be produced each hour on the assembly line. The managers believe that there is a relationship between the defect rate and the production rate per hour. The managers believe that they can use production rate to predict the number of defects. The following data were collected for 10 randomly selected hours.
Defects   Production Rate Per Hour
20        400
30        450
10        350
20        375
30        400
25        400
30        450
20        300
10        300
40        300
Given these sample data, the simple linear regression model for predicting the number of defects is approximately ŷ = 5.67 + 0.048x.
Q:
The prediction interval developed from a simple linear regression model will be at its narrowest point when the value of x used to predict y is equal to the mean value of x.
Q:
When the intercept in a regression equation is deemed not significantly different from 0, then in making predictions for y, 0.0 should be used as the value of the intercept rather than the estimated intercept value.
Q:
If the R-squared value for a regression model is high, the regression model will necessarily provide accurate forecasts of the y variable.
Q:
When calculating prediction intervals for predicted values of y based on a given x, all 95 percent prediction intervals will be of equal width.
Q:
When regression analysis is used for descriptive purposes, two of the main items of interest are whether the sign on the regression slope coefficient is positive or negative and whether the regression slope coefficient is significantly different from zero.
Q:
A positive population slope of 12 (β1 = 12) means that a 1-unit increase in x causes an average 12-unit increase in y.
Q:
In simple linear regression, the t-test for the slope and the F-test are both conducting the same hypothesis test.
Q:
In a simple linear regression analysis, if the test statistic for testing the significance of the regression slope coefficient is 3.6, the F ratio from the analysis of variance table is known to be 12.96
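The relationship between the slope t statistic and the ANOVA F ratio in simple linear regression can be checked numerically. The sketch below uses made-up illustrative data (not from this question set) and the textbook formulas t = b1 / sqrt(MSE/Sxx) and F = MSR/MSE with one numerator degree of freedom.

```python
import math

# Illustrative data only (an assumption; not taken from any question here).
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 2.9, 4.2, 4.8, 6.3, 6.9]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((a - mx) ** 2 for a in x)
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
b1 = sxy / sxx
b0 = my - b1 * mx

sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))  # unexplained variation
ssr = sum(((b0 + b1 * a) - my) ** 2 for a in x)            # explained variation
mse = sse / (n - 2)

t = b1 / math.sqrt(mse / sxx)  # t statistic for the slope
f = ssr / mse                  # F ratio (MSR = SSR / 1)

print(math.isclose(t ** 2, f, rel_tol=1e-9))
```

By the same identity, a slope t statistic of 3.6 corresponds to F = 3.6² = 12.96.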
Q:
If the R-square value for a simple linear regression model is .80, the correlation between the two variables is known to be .64.
Q:
The values of the regression coefficients are found such that the sum of the residuals is minimized.
Q:
If a simple least squares regression model is developed based on a sample where the two variables are known to be positively correlated, the sum of the residuals will be positive.
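What least squares does to the residuals can be seen in a small sketch. The data are made up for illustration and positively correlated; the point is that the fitted line minimizes the sum of squared residuals, and with an intercept in the model the residuals themselves sum to zero (up to floating-point error).

```python
# Illustrative, positively correlated data (an assumption, not from the source).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
b0 = my - b1 * mx

residuals = [b - (b0 + b1 * a) for a, b in zip(x, y)]
residual_sum = sum(residuals)
sse = sum(e ** 2 for e in residuals)  # the quantity least squares minimizes

print(abs(residual_sum) < 1e-9, round(sse, 2))
```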
Q:
State University recently randomly sampled ten students and analyzed grade point average (GPA) and number of hours worked off-campus per week. The following data were observed:
GPA    HOURS
3.14   25
2.75   30
3.68   11
3.22   18
2.45   22
2.80   40
3.00   15
2.23   29
3.14   10
2.90   0
In testing the significance of the regression slope coefficient for the independent variable, HOURS, the calculated test statistic is approximately t = -1.47.
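The test statistic in this statement can be checked with a short sketch. It uses the ten (GPA, HOURS) pairs that appear in the data listing and the textbook formulas for the slope, the standard error of the estimate, and the standard error of the slope; variable names are illustrative.

```python
import math

# The (GPA, HOURS) pairs as listed in the data; GPA is the dependent variable.
gpa = [3.14, 2.75, 3.68, 3.22, 2.45, 2.80, 3.00, 2.23, 3.14, 2.90]
hours = [25, 30, 11, 18, 22, 40, 15, 29, 10, 0]

n = len(hours)
mx, my = sum(hours) / n, sum(gpa) / n
sxx = sum((h - mx) ** 2 for h in hours)
sxy = sum((h - mx) * (g - my) for h, g in zip(hours, gpa))
b1 = sxy / sxx
b0 = my - b1 * mx

sse = sum((g - (b0 + b1 * h)) ** 2 for h, g in zip(hours, gpa))
s_e = math.sqrt(sse / (n - 2))   # standard error of the estimate
se_b1 = s_e / math.sqrt(sxx)     # standard error of the slope
t = b1 / se_b1                   # test statistic for H0: slope = 0

print(round(b1, 4), round(se_b1, 4), round(t, 2))
```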
Q:
Assume that we have found a regression equation of ŷ = 3.6 - 2.4x, and that the coefficient of determination is 0.72, then the correlation of x and y must be about 0.849.
Q:
In a simple regression model, if the regression model is deemed to be statistically significant, it means that the regression slope coefficient is significantly greater than zero.
Q:
If the correlation between the dependent variable and the independent variable is negative, the standard error of the regression slope coefficient in a simple linear regression model will also be negative.
Q:
If the correlation of x and y is -0.65, then the coefficient of determination is -0.4225.
Q:
The standard error of the estimate for a simple linear regression model measures the variation in the slope coefficient from sample to sample.