Q:
When using the moving average method, you must select _____, which represent(s) the number of terms in the moving average.
a. a smoothing constant
b. the explanatory variables
c. an alpha value
d. a span
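A minimal Python sketch of the method (the sales figures below are invented): with a span of 3, the forecast for the next period is the average of the three most recent observations.

```python
def moving_average_forecast(series, span):
    """Forecast the next value as the mean of the last `span` observations."""
    if len(series) < span:
        raise ValueError("need at least `span` observations")
    return sum(series[-span:]) / span

sales = [12, 15, 14, 16, 18, 17]  # invented monthly sales
print(moving_average_forecast(sales, span=3))  # mean of 16, 18, 17 -> 17.0
```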
Q:
The moving average method can also be referred to as a(n) _____ method.
a. causal
b. smoothing
c. exponential
d. econometric
Q:
Perhaps the simplest and one of the most frequently used extrapolation methods is the:
a. moving average
b. linear trend
c. exponential trend
d. causal model
Q:
In a random walk model the
a. series itself is random
b. series itself is not random but its differences are random
c. series itself and its differences are random
d. series itself and its differences are not random
Q:
The random walk model is written as Y_t = Y_(t-1) + e_t. In this model, e_t represents the:
a. average of the Y's
b. average of the X's
c. forecasted value
d. random series with mean 0 and some constant standard deviation
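A quick simulation sketch of this model (synthetic data, not part of the original item): the random walk itself meanders, but its first differences recover the random series e_t.

```python
import random

random.seed(42)
noise = [random.gauss(0, 1) for _ in range(200)]  # e_t: mean 0, constant std dev

# Build the walk Y_t = Y_(t-1) + e_t starting from Y_0 = 0.
walk = [0.0]
for e in noise:
    walk.append(walk[-1] + e)

# The first differences of the walk recover the random series e_t.
diffs = [walk[t] - walk[t - 1] for t in range(1, len(walk))]
print(max(abs(d - e) for d, e in zip(diffs, noise)))  # essentially zero
```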
Q:
In contrast to linear trend, exponential trend is appropriate when the time series changes by a:
a. constant amount each time period
b. constant percentage each time period
c. positive amount each time period
d. negative amount each time period
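A toy numeric contrast between the two kinds of trend (starting value and rates are made up): a linear trend adds a constant amount each period, while an exponential trend grows by a constant percentage each period.

```python
linear = [100 + 5 * t for t in range(5)]           # adds 5 units per period
exponential = [100 * 1.05 ** t for t in range(5)]  # grows 5% per period

print(linear)                              # [100, 105, 110, 115, 120]
print([round(v, 2) for v in exponential])  # [100.0, 105.0, 110.25, 115.76, 121.55]
```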
Q:
The linear trend was estimated using a time series with 20 time periods. The forecasted value for time period 21 is
a. 120
b. 122
c. 160
d. 162
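A hypothetical illustration of how a linear trend forecast is produced (the series below is invented and happens to follow Y = 120 + 2t exactly; it is not the equation from the item above): fit the line by least squares over periods 1 through 20, then extrapolate to period 21.

```python
def fit_linear_trend(y):
    """Least-squares fit of y against t = 1, ..., len(y); returns (a, b)
    for the line y = a + b * t."""
    n = len(y)
    t = list(range(1, n + 1))
    t_bar, y_bar = sum(t) / n, sum(y) / n
    b = (sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
         / sum((ti - t_bar) ** 2 for ti in t))
    a = y_bar - b * t_bar
    return a, b

# Invented series that follows Y = 120 + 2t exactly, for periods 1..20.
series = [120 + 2 * t for t in range(1, 21)]
a, b = fit_linear_trend(series)
print(round(a + b * 21, 1))  # forecast for period 21: 162.0
```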
Q:
A linear trend means that the time series variable changes by a:
a. constant amount each time period
b. constant percentage each time period
c. positive amount each time period
d. negative amount each time period
Q:
The most common form of autocorrelation is positive autocorrelation, in which:
a. large observations tend to follow both large and small observations
b. small observations tend to follow both large and small observations
c. large observations tend to follow large observations and small observations tend to follow small observations
d. large observations tend to follow small observations and small observations tend to follow large observations
Q:
In a random series, successive observations are probabilistically independent of one another. If this property is violated, the observations are said to be:
a. autocorrelated
b. intercorrelated
c. causal
d. seasonal
Q:
Related to the runs test, if T is reasonably large (T > 20 is suggested), then the _____ statistic can be used to perform this test.
a. F
b. t
c. Z
d. chi-square
Q:
The idea behind the runs test is that a random number series should have a number of runs that is:
a. large
b. small
c. not large or small
d. constant
Q:
The runs test uses a series of 0's and 1's. The 0's and 1's represent whether each observation is:
a. above or below the predicted value of Y
b. above or below the mean value of Y
c. above or below the mean value of the previous two observations
d. positive or negative
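The coding-and-counting idea behind the runs test can be sketched as follows (toy series, coding each value relative to the series mean):

```python
def count_runs(series):
    """Code each value as 1 (above the mean) or 0 (not above), then
    count maximal stretches of identical codes."""
    mean = sum(series) / len(series)
    codes = [1 if x > mean else 0 for x in series]
    # Every position where the code changes starts a new run.
    return 1 + sum(codes[i] != codes[i - 1] for i in range(1, len(codes)))

print(count_runs([5, 6, 5, 6, 5, 6]))  # 6 runs: zigzagging, too many for 6 points
print(count_runs([5, 5, 5, 6, 6, 6]))  # 2 runs: trending, too few
```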
Q:
Related to the runs test, if you use a Z-statistic and you get a Z value greater than 2.0, this means that there is evidence of _____ in the series.
a. randomness
b. nonrandomness
c. nonnormality
d. heteroscedasticity
Q:
Which of the following is not one of the techniques that can be used to identify whether a time series is truly random?
a. A graph (plot the data)
b. The runs test
c. A control chart
d. The autocorrelations (or a correlogram)
Q:
Examples of non-random patterns that may be evident on a time series graph include:
a. trends
b. increasing variance over time
c. a meandering pattern
d. too many zigzags
e. all of these options
Q:
Which of the following summary measures for forecast errors does not depend on the units of the forecast variable?
a. MAE (mean absolute error)
b. MFE (mean forecast error)
c. RMSE (root mean square error)
d. MAPE (mean absolute percentage error)
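A minimal sketch of the four summary measures (invented actual and forecast values): MAE, MFE, and RMSE all carry the units of the forecast variable, while MAPE is a percentage and therefore unit-free.

```python
# Toy actual and forecast values (invented numbers).
actual = [100.0, 110.0, 120.0, 130.0]
forecast = [98.0, 112.0, 115.0, 131.0]
errors = [a - f for a, f in zip(actual, forecast)]

mae = sum(abs(e) for e in errors) / len(errors)            # units of Y
mfe = sum(errors) / len(errors)                            # units of Y
rmse = (sum(e * e for e in errors) / len(errors)) ** 0.5   # units of Y
mape = 100 * sum(abs(e) / a for e, a in zip(errors, actual)) / len(errors)  # percent

print(round(mae, 2), round(mfe, 2), round(rmse, 3), round(mape, 2))  # 2.5 1.0 2.915 2.19
```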
Q:
Which of the following is not one of the summary measures for forecast errors that is commonly used?
a. MAE (mean absolute error)
b. MFE (mean forecast error)
c. RMSE (root mean square error)
d. MAPE (mean absolute percentage error)
Q:
The forecast error is the difference between
a. this period's value and the next period's value
b. the average value and the expected value of the response variable
c. the explanatory variable value and the response variable value
d. the actual value and the forecast
Q:
The components of a time series include:
a. base series
b. trend
c. seasonal component
d. cyclic component
e. all of these options
Q:
Models such as moving average, exponential smoothing, and linear trend use only:
a. future values of Y to forecast previous values of Y
b. previous values of Y to forecast future values of Y
c. multiple explanatory variables (not just values of Y) to forecast future values of Y
d. ratio-to-moving-average methods
Q:
Econometric models can also be called:
a. judgmental models
b. time series models
c. causal models
d. environmetric models
Q:
Extrapolation methods attempt to:
a. use non-quantitative methods to predict future values
b. search for patterns in the data and then use those to predict future values
c. find variables that are correlated with the data being predicted
d. predict the next period's value by using the latest period's value
Q:
Forecasting models can be divided into three groups. They are:
a. time series, optimization, and simulation methods
b. judgmental, extrapolation, and econometric methods
c. judgmental, random, and linear methods
d. linear, non-linear, and extrapolation methods
Q:
NARRBEGIN: SA_94_99
A local truck rental company wants to use regression to predict the yearly maintenance expense (Y), in dollars, for a truck using the number of miles driven during the year and the age of the truck in years at the beginning of the year. To examine the relationship, the company has gathered data on 15 trucks, and a regression analysis has been conducted. The regression output is presented below.

Summary measures
  Multiple R           0.9308
  R-Square             0.8665
  Adj R-Square         0.8442
  StErr of Estimate    87.397

ANOVA Table
  Source        df    SS        MS        F         p-value
  Explained      2    594690    297345    38.9287   0.0000
  Unexplained   12     91658      7638

Regression coefficients
                 Coefficient   Std Err   t-value   p-value
  Constant         -680.70     161.27    -4.2210   0.0012
  Miles Driven        0.080      0.015    5.1831   0.0002
  Age of Truck       44.238     10.444    4.2359   0.0012
NARREND
(A) Estimate the regression model. How well does this model fit the given data?
(B) Is there a linear relationship between the two explanatory variables and the dependent variable at the 5% significance level? Explain how you arrived at your answer.
(C) Use the estimated regression model to predict the annual maintenance expense of a truck that is driven 14,000 miles per year and is 5 years old.
(D) Find a 95% prediction interval for the maintenance expense determined in (C). Use a t-multiple = 2.
(E) Find a 95% confidence interval for the maintenance expense for all trucks sharing the characteristics provided in (C). Use a t-multiple = 2.
(F) How do you explain the differences between the widths of the intervals in (D) and (E)?
Q:
NARRBEGIN: SA_88_93
A company that makes baseball caps would like to predict the sales of its main product, standard little league caps. The company has gathered data on monthly sales of caps at all of its retail stores, along with information related to the average retail price, which varies by location. Below you will find regression output comparing these two variables.

Summary measures
  Multiple R           0.5892
  R-Square             0.3472
  StErr of Estimate    10283.97

ANOVA table
  Source        df    SS            MS           F        p-value
  Explained      1     899825600    899825600    8.5082   0.0101
  Unexplained   16    1692159250    105759953

Regression coefficients
                   Coefficient   Std Err    t-value   p-value
  Constant          147984.44    28831.21    5.1328   0.0001
  Average Price      -7370.94     2527.00   -2.9169   0.0101
NARREND
(A) Estimate the regression model. How well does this model fit the given data?
(B) Is there a linear relationship between X and Y at the 5% significance level? Explain how you arrived at your answer.
(C) Use the estimated regression model to predict the number of caps that will be sold during the next month if the average selling price is $10.
(D) Find a 95% prediction interval for the number of caps determined in (C). Use a t-multiple = 2.
(E) Find a 95% confidence interval for the average number of caps sold given an average selling price of $10. Use a t-multiple = 2.
(F) How do you explain the differences between the widths of the intervals in (D) and (E)?
Q:
NARRBEGIN: SA_82_87
An internet-based retail company that specializes in audio and visual equipment is interested in creating a model to determine the amount of money, in dollars, its customers will spend purchasing products from them in the coming year. In order to create a reliable model, this company has tracked a number of variables on its customers. Below you will find the Excel output related to several of these variables. This company has tried using the customer's annual salary for the entire household, the number of children in the household, and whether the customer purchased merchandise from them in the previous year (in 2004).

Summary measures
  Multiple R           0.7825
  R-Square             0.6122
  Adj R-Square         0.5852
  StErr of Estimate    541.70

ANOVA Table
  Source        df    SS          MS        F         p-value
  Explained      3    19921803    6640601   22.6303   0.0000
  Unexplained   43    12617877     293439

Regression coefficients
                        Coefficient   Std Err   t-value   p-value
  Constant                 291.243    193.840    1.5025   0.1403
  Salary                     0.026      0.003    8.0182   0.0000
  Number of children      -331.972     79.725   -4.1640   0.0001
  Purchase in 2004         281.80     133.82     2.1058   0.0411
NARREND
(A) Estimate the regression model. How well does this model fit the data?
(B) Is there a linear relationship between the explanatory variables and the dependent variable at the 5% significance level? Explain how you arrived at your answer.
(C) Use the estimated regression model to predict the amount of money a customer will spend if their annual salary is $45,000, they have 1 child, and they purchased merchandise in the previous year (2004).
(D) Find a 95% prediction interval for the point prediction calculated in (C). Use a t-multiple = 2.02.
(E) Find a 95% confidence interval for the amount of money spent by all customers sharing the characteristics described in (C). Use a t-multiple = 2.02.
(F) How do you explain the differences between the widths of the intervals in (D) and (E)?
Q:
NARRBEGIN: SA_79_81
The information below represents the relationship between the selling price (Y, in $1,000) of a home, the square footage of the home (X1), and the number of rooms in the home (X2). The data represent 60 homes sold in a particular area of East Lansing, Michigan, and were analyzed using multiple linear regression as well as a simple regression for each independent variable. The first two tables relate to the multiple regression analysis.

Summary measures
  Multiple R           0.941
  R-Square             0.885
  Adj R-Square         0.866
  StErr of Estimate    20.84

Regression coefficients
                     Coefficient
  Constant             -13.9705
  Size                   7.4336
  Number of Rooms        5.3055

The following table is for a simple regression model using only size (R-Square = 0.8812).
                 Coefficient   Std Err   t-value   p-value
  Constant          14.771     19.691    0.7502    0.4665
  Size               7.816      0.796    9.8190    0.0000

The following table is for a simple regression model using only number of rooms (R-Square = 0.3657).
                     Coefficient   Std Err    t-value   p-value
  Constant             -93.460     108.269    -0.8632   0.4037
  Number of Rooms       41.292      15.082     2.7379   0.0169
NARREND
(A) Use the information related to the multiple regression model to determine whether each of the regression coefficients is statistically different from 0 at a 5% significance level. Summarize your findings.
(B) Test at the 5% significance level the relationship between Y and X in each of the simple linear regression models. How does this compare to your answer in (A)? Explain.
(C) Is there evidence of multicollinearity in this situation? Explain why or why not.
Q:
NARRBEGIN: SA_76_78
The owner of a pizza restaurant chain would like to predict the sales of her specialty, the deep-dish Mexican pizza. She has gathered data on monthly sales of the deep-dish Mexican pizza at her restaurants. She has also gathered information related to the average price of the deep-dish pizzas, the monthly advertising expenditures, and the disposable income per household in the areas surrounding the restaurants. Below you will find output from the stepwise regression analysis. The p-value method was used with a cutoff of 0.05.

Summary measures
  Multiple R           0.9513
  R-Square             0.9049
  Adj R-Square         0.8990
  StErr of Estimate    3924.53

Regression coefficients
                              Coefficient   Std Err    t-value   p-value
  Constant                     -45233.64    8914.72    -5.0740   0.0001
  Monthly Adv. Expenditures         1.972      0.160   12.3405   0.0000
NARREND
(A) Summarize the findings of the stepwise regression method using this cutoff value.
(B) When the cutoff value was increased to 0.10, the output below was the result. The first table represents the change when the disposable income variable is added to the model, and the second table represents the average price variable being added. The regression model with both added variables is shown in the bottom table. Summarize the results for this model.

Disposable income variable being added
  Summary measures                  % Change
  Multiple R           0.9608        1.00%
  R-Square             0.9232        2.00%
  Adj R-Square         0.913         1.60%
  StErr of Estimate    3643.11      -7.20%

Average price variable being added
  Summary measures                  % Change
  Multiple R           0.9723        1.20%
  R-Square             0.9454        2.40%
  Adj R-Square         0.9337        2.30%
  StErr of Estimate    3179.03     -12.70%

Regression coefficients
                              Coefficient   Std Err     t-value   p-value
  Constant                     -73971.53    23803.23    -3.1076   0.0077
  Monthly Adv. Expenditures         0.952       0.375    2.5387   0.0236
  Disposable Income                 2.606       0.977    2.6659   0.0184
  Average Price                 -2056.27      861.342   -2.3873   0.0316

(C) Which model would you recommend using? Why?
Q:
Below you will find a scatterplot of data gathered by a mail-order company. The company has been able to obtain the annual salaries of their customers and the amount that each of these customers spent with the company in 1998. Based on the scatterplot below, would you conclude that these data meet all four assumptions of regression? Explain your answer.
Q:
NARRBEGIN: SA_71_74
A carpet company, which sells and installs carpet, believes that there should be a relationship between the number of carpet installations that they will have to perform in a given month and the number of building permits that have been issued within the county where they are located. Below you will find a regression model that compares the relationship between the number of monthly carpet installations (Y) and the number of building permits that have been issued in a given month (X). The data represent monthly values for the past 10 months.

Summary measures
  Multiple R           0.5682
  R-Square             0.3229
  StErr of Estimate    9603.23

ANOVA table
  Source        df    SS           MS          F        p-value
  Explained      1    351824479    351824479   3.8150   0.0866
  Unexplained    8    737775521     92221940

Regression coefficients
               Coefficient   Std Err    t-value   p-value
  Constant     -115076.69    82933.46   -1.3876   0.2027
  Permits           53.469      27.375   1.9532   0.0866
NARREND
(A) Estimate the regression model. How well does this model fit the given data?
(B) Is there a linear relationship between the number of carpet installations and the number of building permits issued at the 0.10 significance level? Explain how you arrived at your answer.
(C) The Durbin-Watson statistic for this data was 1.2183. Given this information, what would you conclude about the data?
(D) Given your answer in (C), would you recommend modifying the original regression model? If so, how would you modify it?
Q:
A confidence interval constructed around a point prediction from a regression model is called a prediction interval, because the actual point being estimated is not a population parameter.
Q:
In order to estimate with 90% confidence a particular value of Y for a given value of X in a simple linear regression problem, a random sample of 20 observations is taken. The appropriate t-value that would be used is 1.734.
Q:
The Durbin-Watson statistic can be used to measure the degree of autocorrelation.
Q:
One method of dealing with heteroscedasticity is to try a logarithmic transformation of the data.
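A small simulation sketch of why the logarithmic transformation can help (all numbers below are synthetic): when errors are multiplicative, the spread of Y grows with its level, but on the log scale the errors have roughly constant variance.

```python
import math
import random

random.seed(1)
x = [float(i) for i in range(1, 51)]
# Multiplicative errors: the spread of y grows with its level.
y = [10 * xi * math.exp(random.gauss(0, 0.2)) for xi in x]

# On the log scale the model is log(y) = log(10) + log(x) + e,
# where e has constant variance, so the "fan shape" disappears.
log_errors = [math.log(yi) - (math.log(10) + math.log(xi)) for xi, yi in zip(x, y)]
print(round(sum(log_errors) / len(log_errors), 4))  # small (the e's have mean 0)
```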
Q:
One method of diagnosing heteroscedasticity is to plot the residuals against the predicted values of Y, then look for a change in the spread of the plotted values.
Q:
One of the potential characteristics of an outlier is that the value of the dependent variable is much larger or smaller than predicted by the regression line.
Q:
If the partial F test indicates that a group of variables is significant, it also implies that each variable in the group is significant.
Q:
The partial F test is a procedure to determine whether extra variables in a group provide any extra explanatory power in the regression equation.
Q:
A forward procedure is a type of equation building procedure that begins with only one explanatory variable in the regression equation and successively adds one variable at a time until no remaining variables make a significant contribution.
Q:
A backward procedure is a type of equation building procedure that begins with all potential explanatory variables in the regression equation and deletes them two at a time until further deletion would reduce the percentage of variation explained to a value less than 0.50.
Q:
In multiple regression, the problem of multicollinearity affects the t-tests of the individual coefficients as well as the F-test in the analysis of variance for regression, since the F-test combines these t-tests into a single test.
Q:
In multiple regression, if there is multicollinearity between independent variables, the t-tests of the individual coefficients may indicate that some variables are not linearly related to the dependent variable, when in fact they are.
Q:
When there is a group of explanatory variables that are in some sense logically related, all of them must be included in the regression equation.
Q:
Multicollinearity is a situation in which two or more of the explanatory variables are highly correlated with each other.
Q:
In order to test the significance of a multiple regression model involving 4 explanatory variables and 40 observations, the numerator and denominator degrees of freedom for the critical value of F are 4 and 35, respectively.
Q:
Suppose that one equation has 3 explanatory variables and an F-ratio of 49. Another equation has 5 explanatory variables and an F-ratio of 38. The first equation will always be considered a better model.
Q:
In a multiple regression problem involving 30 observations and four explanatory variables, SST = 800 and SSE = 240. The value of the F-statistic for testing the significance of this model is 14.583.
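The arithmetic behind the statement above can be checked directly, using only the numbers given: SSR = SST − SSE, and F = (SSR / k) / (SSE / (n − k − 1)).

```python
# n = 30 observations, k = 4 explanatory variables, SST = 800, SSE = 240.
n, k, sst, sse = 30, 4, 800, 240
ssr = sst - sse                           # explained variation: 560
f_stat = (ssr / k) / (sse / (n - k - 1))  # MSR / MSE with df = 4 and 25
print(round(f_stat, 3))  # 14.583
```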
Q:
In multiple regression, a large value of the test statistic F indicates that most of the variation in Y is unexplained by the regression equation and that the model is useless. A small value of F indicates that most of the variation in Y is explained by the regression equation and that the model is useful.
Q:
A multiple regression model involves 40 observations and 4 explanatory variables produces SST = 1000 and SSR = 804. The value of MSE is 5.6.
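The MSE claim above can likewise be verified from the numbers given: SSE = SST − SSR, and MSE = SSE / (n − k − 1).

```python
# n = 40 observations, k = 4 explanatory variables, SST = 1000, SSR = 804.
n, k, sst, ssr = 40, 4, 1000, 804
sse = sst - ssr            # unexplained variation: 196
mse = sse / (n - k - 1)    # 196 / 35
print(mse)  # 5.6
```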
Q:
The value of the sum of squares due to regression, SSR, can never be larger than the value of the sum of squares total, SST.
Q:
In regression analysis, the total variation in the dependent variable Y, measured by Σ(Yi − Ȳ)² and referred to as SST, can be decomposed into two parts: the explained variation, measured by SSR, and the unexplained variation, measured by SSE.
Q:
In regression analysis, the unexplained part of the total variation in the response variable Y is referred to as sum of squares due to regression, SSR.
Q:
In a simple linear regression problem, if the standard error of estimate = 15 and n = 8, then the sum of squares for error, SSE, is 1,350.
Q:
In a multiple regression analysis involving 4 explanatory variables and 40 data points, the degrees of freedom associated with the sum of squared errors, SSE, is 35.
Q:
The residuals are observations of the error variable ε. Consequently, the minimized sum of squared deviations is called the sum of squared errors, labeled SSE.
Q:
In testing the overall fit of a multiple regression model in which there are three explanatory variables, the null hypothesis is H0: β1 = β2 = β3 = 0.
Q:
In multiple regression with k explanatory variables, the t-tests of the individual coefficients allow us to determine whether βi = 0 (for i = 1, 2, ..., k), which tells us whether a linear relationship exists between Xi and Y.
Q:
In simple linear regression, if the error variable ε is normally distributed, the test statistic for testing H0: β1 = 0 is t-distributed with n − 2 degrees of freedom.
Q:
In a simple linear regression model, testing whether the slope of the population regression line could be zero is the same as testing whether or not the linear relationship between the response variable Y and the explanatory variable X is significant.
Q:
Multiple regression represents an improvement over simple regression because it allows any number of response variables to be included in the analysis.
Q:
If exact multicollinearity exists, that means that there is redundancy in the data.
Q:
In time series data, errors are often not probabilistically independent.
Q:
In regression analysis, homoscedasticity refers to constant error variance.
Q:
The assumptions of regression are: 1) there is a population regression line, 2) the dependent variable is normally distributed, 3) the standard deviation of the response variable remains constant as the explanatory variables increase, and 4) the errors are probabilistically independent.
Q:
Suppose you forecast the values of all of the independent variables and insert them into a multiple regression equation and obtain a point prediction for the dependent variable. You could then use the standard error of the estimate to obtain an approximate
a. confidence interval
b. prediction interval
c. hypothesis test
d. independence test
Q:
In regression analysis, extrapolation is performed when you:
a. attempt to predict beyond the limits of the sample
b. have to estimate some of the explanatory variable values
c. have to use a lag variable as an explanatory variable in the model
d. don't have observations for every period in the sample
Q:
If residuals separated by one period are autocorrelated, this is called:
a. simple autocorrelation
b. redundant autocorrelation
c. time 1 autocorrelation
d. lag 1 autocorrelation
Q:
The _____ can be used to test for autocorrelation.
a. regression coefficient
b. correlation coefficient
c. Durbin-Watson statistic
d. F-test or t-test
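As an illustrative sketch (not part of the test bank), the Durbin-Watson statistic can be computed directly from the regression residuals; the residual series below are made up.

```python
# Durbin-Watson from a residual series:
# DW = sum_t (e_t - e_(t-1))^2 / sum_t e_t^2.
# Values near 2 suggest no lag 1 autocorrelation; values well below 2
# suggest positive autocorrelation, well above 2 negative autocorrelation.

def durbin_watson(residuals):
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Made-up residual series:
print(durbin_watson([1, 1, 1, -1, -1, -1]))  # well below 2: positive autocorrelation
print(durbin_watson([1, -1, 1, -1, 1, -1]))  # near 4: negative autocorrelation
```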
Q:
A researcher can check whether the errors are normally distributed by using:
a. a t-test or an F-test
b. the Durbin-Watson statistic
c. a frequency distribution or the value of the regression coefficient
d. a histogram or a Q-Q plot
Q:
When the error variance is nonconstant, it is common to see the variation increase as the explanatory variable increases (you will see a "fan shape" in the scatterplot). There are two ways you can deal with this phenomenon. These are:
a. the weighted least squares and a logarithmic transformation
b. the partial F and a logarithmic transformation
c. the weighted least squares and the partial F
d. stepwise regression and the partial F
Q:
A point that "tilts" the regression line toward it is referred to as a(n):
a. magnetic point
b. influential point
c. extreme point
d. explanatory point
Q:
If you can determine that the outlier is not really a member of the relevant population, then it is appropriate and probably best to:
a. average it
b. reduce it
c. delete it
d. leave it
Q:
Which of the following would be considered a definition of an outlier?
a. An extreme value for one or more variables
b. A value whose residual is abnormally large in magnitude
c. Values for individual explanatory variables that fall outside the general pattern of the other observations
d. All of these options
Q:
There are situations where a set of explanatory variables forms a logical group. The test to determine whether the extra variables provide enough extra explanatory power to warrant inclusion in the equation is referred to as the:
a. complete F-test
b. reduced F-test
c. partial F-test
d. reduced t-test
Q:
Forward regression:
a. begins with all potential explanatory variables in the equation and deletes them one at a time until further deletion would do more harm than good.
b. adds and deletes variables until an optimal equation is achieved.
c. begins with no explanatory variables in the equation and successively adds one at a time until no remaining variables make a significant contribution.
d. randomly selects the optimal number of explanatory variables to be used
Q:
The objective typically used in the three types of equation-building procedures is to:
a. find the equation with a small se
b. find the equation with a large R2
c. find the equation with a small se and a large R2
d. find the equation with the largest F-statistic
Q:
Many statistical packages have three types of equation-building procedures. They are:
a. forward, linear and non-linear
b. forward, backward and stepwise
c. simple, complex and stepwise
d. inclusion, exclusion and linear
Q:
Determining which variables to include in regression analysis by estimating a series of regression equations by successively adding or deleting variables according to prescribed rules is referred to as:
a. elimination regression
b. forward regression
c. backward regression
d. stepwise regression
Q:
When determining whether to include or exclude a variable in regression analysis, if the p-value associated with the variable's t-value is above some accepted significance value, such as 0.05, then the variable:
a. is a candidate for inclusion
b. is a candidate for exclusion
c. is redundant
d. does not fit the guidelines of parsimony
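The inclusion/exclusion rule described above can be sketched as a simple check (the function name and p-values below are hypothetical): compare the variable's p-value with the chosen significance cutoff.

```python
def candidate_status(p_value, cutoff=0.05):
    """A variable with p-value above the cutoff is a candidate for
    exclusion; otherwise it is a candidate for inclusion."""
    return "exclusion" if p_value > cutoff else "inclusion"

print(candidate_status(0.41))   # exclusion
print(candidate_status(0.003))  # inclusion
```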