Data Analysis and Decision Modelling


Introduction

The purpose of this report is to illustrate the relationship and interactions between a set of variable factors within a regression model, and then to predict future changes in Apple’s share price on the share market. The report is built on collecting data, identifying the variables, and building a multiple regression model, which is then analysed using information and graphs such as the P-value, VIF and F test.

 

Original Data:

Date Gold_x_AAPL_acc1 Silver_vel10_x_Baltic_dry_acc2 Oil_vel5_x_AAPL_vel7 Baltic_dry_vel5_x_AAPL_vel11 5year_vel4_x_SP500_vel12 Future_change
12/11/2013 0.021007 0.039063 0.182188 0.103333 0.291806 0.2625
11/11/2013 0.027378 0.064445 0.235156 0.253993 0.301476 0.316667
8/11/2013 0.08724 0.015625 0.146979 0.234896 0.372431 0.395833
7/11/2013 0.034236 0.012708 0.110677 0.360972 0.045833 0.658333
6/11/2013 0.076563 0.197777 0.095 0.475694 0.331459 0.320833
5/11/2013 0.054861 0.235989 0.011458 0.375 0.335313 0.325
4/11/2013 0.125417 0.393021 0.014236 0.155694 0.473611 0.254167
1/11/2013 0.079132 0.5875 0.038889 0.055087 0.516076 0.541667
31/10/2013 0.076823 0.60783 0.201562 0.040399 0.539688 0.554167
30/10/2013 0.209792 0.092083 0.196875 0.014844 0.437708 0.445833
29/10/2013 0.068177 0.395416 0.382673 0.022083 0.455381 0.745833
11/02/2013 0.416666 0.129167 0.444618 0.455381 0.119401 0.408333
8/02/2013 0.628334 0.551597 0.189496 0.367362 0.189844 0.6
7/02/2013 0.847084 0.477969 0.148541 0.015625 0.053437 0.916667
6/02/2013 0.320833 0.364653 0.183768 0.012413 0.13125 0.929167
5/02/2013 0.824913 0.322917 0.193958 0.004375 0.305278 0.3
4/02/2013 0.075347 0.362465 0.232916 0.005208 0.195972 0.641667
1/02/2013 0.496875 0.44625 0.008594 0.006094 0.385486 0.475
31/01/2013 0.419966 0.30586 0.019201 0.022656 0.407057 0.820833

 

The data used in this report were collected from Yahoo Finance (Yahoo finance, 2013). Yahoo Finance provides a wide range of statistics as an openly available resource. This report therefore selected 5 independent variables from the Yahoo Finance database to analyse the fluctuation of the share price and to predict its future changes.

 

 

 

 

Identification of Variables

Y variable

Variable: future change

X variable

Variable 1: Gold_x_AAPL_acc1

This variable reflects the relationship between gold and Apple.

 

Variable 2: Silver_vel10_x_Baltic_dry_acc2

This variable reflects the view that raw materials and the shipping of goods affect Apple’s business operations.

 

Variable 3: Oil_vel5_x_AAPL_vel7

This variable reflects the relevance of oil prices to the cost of shipping goods worldwide.

 

Variable 4: Baltic_dry_vel5_x_AAPL_vel11

This variable helps in understanding the cost of shipping goods worldwide, such as delivering finished goods from the factory to markets.

 

Variable 5: 5year_vel4_x_SP500_vel12

This variable compares the short-term share price with the SP500 figure to show how Apple is performing in the market.

 

Check inputs for collinearity

Collinearity is the undesirable situation in which the correlations among the independent variables are strong. Collinearity misleadingly inflates the standard errors (Evans, 2013). As a result, some variables may be found not to be significantly different from 0 when they would otherwise be significant.

First of all, we check whether any of the inputs are collinear by using the PHStat Multiple Regression function to compute the VIF. The VIF, or variance inflation factor, measures how strongly one independent variable is linearly explained by the other independent variables. When independent variables are collinear, they carry overlapping explanations, and the researcher cannot tell how much impact each one has on Y. The smaller the VIF, the weaker the collinearity. If the VIF for one of the variables is around or greater than 5, there is collinearity associated with that variable, and one of the variables involved must be removed from the regression model (Evans 2013). In this case, the VIFs of all 5 independent variables are less than 5; therefore, there is no collinearity problem in this report.

The outputs are shown below:

 

 

Regression Analysis

5year_vel4_x_SP500_vel12 and all other X
Regression Statistics
Multiple R 0.2601
R Square 0.0677
Adjusted R Square 0.0484
Standard Error 0.2053
Observations 199
VIF 1.0726
Regression Analysis
Baltic_dry_vel5_x_AAPL_vel11 and all other X
Regression Statistics
Multiple R 0.4759
R Square 0.2265
Adjusted R Square 0.2105
Standard Error 0.1770
Observations 199
VIF 1.2928

 

Regression Analysis
Silver_vel10_x_Baltic_dry_acc2 and all other X
Regression Statistics
Multiple R 0.3629
R Square 0.1317
Adjusted R Square 0.1138
Standard Error 0.2020
Observations 199
VIF 1.1517

 

Regression Analysis
Oil_vel5_x_AAPL_vel7 and all other X
Regression Statistics
Multiple R 0.5462
R Square 0.2984
Adjusted R Square 0.2839
Standard Error 0.1985
Observations 199
VIF 1.4253

 

Regression Analysis
Gold_x_AAPL_acc1 and all other X
Regression Statistics
Multiple R 0.1711
R Square 0.0293
Adjusted R Square 0.0093
Standard Error 0.1758
Observations 199
VIF 1.0301

 

The output tables show that all the VIFs are less than 5.
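The VIF figures above can be reproduced from the R Square of each auxiliary regression, since VIF = 1 / (1 - R Square). The following is a minimal sketch in Python rather than PHStat; the function name and the dictionary layout are illustrative assumptions, and the R Square values are the rounded figures copied from the tables above, so the last digit may differ slightly from the PHStat output.

```python
def vif_from_r_squared(r_squared):
    """Variance inflation factor of one input, given the R Square obtained
    by regressing that input on all the other inputs."""
    return 1.0 / (1.0 - r_squared)


# R Square of each auxiliary regression, as reported in the tables above.
aux_r_squared = {
    "5year_vel4_x_SP500_vel12": 0.0677,
    "Baltic_dry_vel5_x_AAPL_vel11": 0.2265,
    "Silver_vel10_x_Baltic_dry_acc2": 0.1317,
    "Oil_vel5_x_AAPL_vel7": 0.2984,
    "Gold_x_AAPL_acc1": 0.0293,
}

VIF_THRESHOLD = 5  # rule of thumb used in this report (Evans, 2013)

vifs = {name: vif_from_r_squared(r2) for name, r2 in aux_r_squared.items()}
flagged = [name for name, vif in vifs.items() if vif >= VIF_THRESHOLD]

for name, vif in vifs.items():
    print(f"{name}: VIF = {vif:.4f}")
print("Inputs flagged for collinearity:", flagged)
```

With these inputs the list of flagged variables is empty, which agrees with the conclusion that no column needs to be removed for collinearity.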

 

 

Check residuals are normally distributed

Next, we check whether the residuals are normally distributed, since the hypothesis tests rely on this assumption. Using the Excel Data Analysis function, we can compute the following graphs.

The last graph is a straight line (or almost straight), which indicates that the residuals are normally distributed and that our hypothesis tests will be accurate.

 

Check which inputs are helping.

In this step, we check two important figures: adjusted R square and P-value.

R square

In the regression model, R square represents the proportion of the variation in Y that is explained by the independent variables. As a result, R square can serve as a measure of how accurately the Xs predict Y. Its value is the ratio of the regression sum of squares (SSR) to the total sum of squares (SST). Furthermore, adjusted R square is a version of R square that is adjusted for the number of independent variables in the model, so adding an input that does not help can lower it.

As can be seen from the regression analysis, the adjusted R square is 0.158405, which points to a weak relationship: the Xs explain only about 16% of the variation in Y. Most of the variation in Y is therefore left to elements outside the model. However, the relationship is still statistically significant.

Regression Statistics
Multiple R 0.42386
R Square 0.179658
Adjusted R Square 0.158405
Standard Error 0.249028
Observations 199

 

  Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept 0.497323086 0.041719076 11.92076 6.581E-25 0.415039232 0.579606941 0.415039232 0.579606941
Gold_x_AAPL_acc1 0.18629918 0.101725034 1.8314 0.06858258 -0.014336327 0.386934688 -0.014336327 0.386934688
Silver_vel10_x_Baltic_dry_acc2 0.377818335 0.088513082 4.268503 3.083E-05 0.203241179 0.552395491 0.203241179 0.552395491
Oil_vel5_x_AAPL_vel7 0.108244958 0.090080908 1.201641 0.23097488 -0.069424471 0.285914387 -0.069424471 0.285914387
Baltic_dry_vel5_x_AAPL_vel11 -0.256465655 0.100999 -2.539289 0.01189634 -0.455669182 -0.05726213 -0.455669182 -0.057262129
5year_vel4_x_SP500_vel12 -0.361791706 0.087098442 -4.153825 4.9072E-05 -0.533578722 -0.19000469 -0.533578722 -0.19000469

 

In the output sheet, the adjusted R square is 0.158405. Further down, we check the P-value of each input. If an input has a P-value greater than 0.05, we should consider deleting that whole column of inputs from the original data. The table above shows two inputs with P-values greater than 0.05 (Gold_x_AAPL_acc1 and Oil_vel5_x_AAPL_vel7), so we go back to the original data and delete those two columns to see how the adjusted R square changes.
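The screening rule described above can be sketched in a few lines of Python. The figures are copied from the coefficient table; the variable names and the tuple layout are illustrative assumptions, not part of the Excel output. The sketch also recomputes each t Stat as the coefficient divided by its standard error, which is how that column relates to the others.

```python
ALPHA = 0.05  # significance level used in this report

# (coefficient, standard error, P-value) for each input, from the table above.
regression_output = {
    "Gold_x_AAPL_acc1": (0.18629918, 0.101725034, 0.06858258),
    "Silver_vel10_x_Baltic_dry_acc2": (0.377818335, 0.088513082, 3.083e-05),
    "Oil_vel5_x_AAPL_vel7": (0.108244958, 0.090080908, 0.23097488),
    "Baltic_dry_vel5_x_AAPL_vel11": (-0.256465655, 0.100999, 0.01189634),
    "5year_vel4_x_SP500_vel12": (-0.361791706, 0.087098442, 4.9072e-05),
}

# t Stat = coefficient / standard error.
t_stats = {
    name: coef / std_err
    for name, (coef, std_err, _p) in regression_output.items()
}

# Inputs whose P-value exceeds ALPHA are candidates for removal.
removal_candidates = [
    name for name, (_c, _s, p_value) in regression_output.items()
    if p_value > ALPHA
]

for name, t_stat in t_stats.items():
    print(f"{name}: t = {t_stat:.4f}")
print("Candidates for removal:", removal_candidates)
```

A high P-value only makes a column a candidate for removal; whether it is actually dropped depends on how the adjusted R square responds.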

 

Date Silver_vel10_x_Baltic_dry_acc2 Baltic_dry_vel5_x_AAPL_vel11 5year_vel4_x_SP500_vel12 Future_change
12/11/2013 0.039063 0.103333 0.291806 0.2625
11/11/2013 0.064445 0.253993 0.301476 0.316667
8/11/2013 0.015625 0.234896 0.372431 0.395833
7/11/2013 0.012708 0.360972 0.045833 0.658333
6/11/2013 0.197777 0.475694 0.331459 0.320833
5/11/2013 0.235989 0.375 0.335313 0.325
4/11/2013 0.393021 0.155694 0.473611 0.254167
1/11/2013 0.5875 0.055087 0.516076 0.541667
31/10/2013 0.60783 0.040399 0.539688 0.554167
30/10/2013 0.092083 0.014844 0.437708 0.445833
29/10/2013 0.395416 0.022083 0.455381 0.745833
11/02/2013 0.129167 0.455381 0.119401 0.408333
8/02/2013 0.551597 0.367362 0.189844 0.6
7/02/2013 0.477969 0.015625 0.053437 0.916667
6/02/2013 0.364653 0.012413 0.13125 0.929167
5/02/2013 0.322917 0.004375 0.305278 0.3
4/02/2013 0.362465 0.005208 0.195972 0.641667
1/02/2013 0.44625 0.006094 0.385486 0.475
31/01/2013 0.30586 0.022656 0.407057 0.820833

 

After deleting those two columns, the output tables are as below:

Regression Statistics  
Multiple R 0.399072  
R Square 0.159258  
Adjusted R Square 0.146324  
Standard Error 0.250809  
Observations 199  
  Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept 0.53142673 0.038261011 13.8895108 6.05374E-31 0.455968204 0.606885246 0.455968204 0.606885246
Silver_vel10_x_Baltic_dry_acc2 0.42743504 0.085345172 5.008309548 1.22576E-06 0.259116945 0.595753133 0.259116945 0.595753133
Baltic_dry_vel5_x_AAPL_vel11 -0.206687 0.09149364 -2.2590309 0.024988049 -0.38713109 -0.02624283 -0.38713109 -0.02624283
5year_vel4_x_SP500_vel12 -0.3276067 0.085414441 -3.83549547 0.000169117 -0.49606141 -0.159152 -0.49606141 -0.159152

 

After deleting the two columns, the adjusted R square becomes smaller, falling from 0.158405 to 0.146324. Hence we should un-delete the last column we deleted. Given this change in the adjusted R square, instead of deleting two columns from the original data, we delete only the column with the higher P-value (Oil_vel5_x_AAPL_vel7). The table is shown below.

Date Gold_x_AAPL_acc1 Silver_vel10_x_Baltic_dry_acc2 Baltic_dry_vel5_x_AAPL_vel11 5year_vel4_x_SP500_vel12 Future_change
12/11/2013 0.021007 0.039063 0.103333 0.291806 0.2625
11/11/2013 0.027378 0.064445 0.253993 0.301476 0.316667
8/11/2013 0.08724 0.015625 0.234896 0.372431 0.395833
7/11/2013 0.034236 0.012708 0.360972 0.045833 0.658333
6/11/2013 0.076563 0.197777 0.475694 0.331459 0.320833
5/11/2013 0.054861 0.235989 0.375 0.335313 0.325
4/11/2013 0.125417 0.393021 0.155694 0.473611 0.254167
1/11/2013 0.079132 0.5875 0.055087 0.516076 0.541667
31/10/2013 0.076823 0.60783 0.040399 0.539688 0.554167
30/10/2013 0.209792 0.092083 0.014844 0.437708 0.445833
29/10/2013 0.068177 0.395416 0.022083 0.455381 0.745833
11/02/2013 0.416666 0.129167 0.455381 0.119401 0.408333
8/02/2013 0.628334 0.551597 0.367362 0.189844 0.6
7/02/2013 0.847084 0.477969 0.015625 0.053437 0.916667
6/02/2013 0.320833 0.364653 0.012413 0.13125 0.929167
5/02/2013 0.824913 0.322917 0.004375 0.305278 0.3
4/02/2013 0.075347 0.362465 0.005208 0.195972 0.641667
1/02/2013 0.496875 0.44625 0.006094 0.385486 0.475
31/01/2013 0.419966 0.30586 0.022656 0.407057 0.820833

 

Regression Statistics
Multiple R 0.416557415
R Square 0.17352008
Adjusted R Square 0.156479257
Standard Error 0.249313166
Observations 199

Comparing the three adjusted R square values, the results after deleting columns (0.146324 with two columns deleted and 0.156479 with one column deleted) are both smaller than the original 0.158405. Thus, we decide not to delete any columns.
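The comparison can be checked with the standard formula, adjusted R square = 1 - (1 - R square)(n - 1)/(n - k - 1), where n is the number of observations and k the number of inputs. The sketch below is in Python rather than Excel; the function name and the model labels are illustrative assumptions, and the R Square values are the figures reported in the three output tables.

```python
def adjusted_r_squared(r_squared, n_obs, n_inputs):
    """Adjust R square for the number of inputs in the model."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_inputs - 1)


N_OBS = 199  # observations reported in every output table

# label: (R Square, number of inputs)
models = {
    "all five inputs": (0.179658, 5),
    "Gold and Oil deleted": (0.159258, 3),
    "Oil deleted": (0.17352008, 4),
}

adjusted = {
    label: adjusted_r_squared(r2, N_OBS, k)
    for label, (r2, k) in models.items()
}
best_model = max(adjusted, key=adjusted.get)

for label, value in adjusted.items():
    print(f"{label}: adjusted R square = {value:.6f}")
print("Highest adjusted R square:", best_model)
```

The recomputed values agree with the reported 0.158405, 0.146324 and 0.156479 to about six decimal places, and the model with all five inputs has the highest adjusted R square.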

 

Make a prediction with a confidence interval

To predict the future change, we use PHStat’s Confidence Interval Estimate & Prediction function with a 95% confidence level. We use the row of original data shown below to fill in the table and predict the future change.

Date Gold_x_AAPL_acc1 Silver_vel10_x_Baltic_dry_acc2 Oil_vel5_x_AAPL_vel7 Baltic_dry_vel5_x_AAPL_vel11 5year_vel4_x_SP500_vel12 Future_change
11/11/2013 0.027378 0.064445 0.235156 0.253993 0.301476 0.316667

 

Data
Confidence Level 95%
1
Gold_x_AAPL_acc1 given value 0.027378
Silver_vel10_x_Baltic_dry_acc2 given value 0.064445
Oil_vel5_x_AAPL_vel7 given value 0.235156
Baltic_dry_vel5_x_AAPL_vel11 given value 0.253993
5year_vel4_x_SP500_vel12 given value 0.301476

 

For Average Predicted Y (YHat)
Interval Half Width 0.05934
Confidence Interval Lower Limit 0.318675
Confidence Interval Upper Limit 0.437354
   
For Individual Response Y
Interval Half Width 0.494738
Prediction Interval Lower Limit -0.11672
Prediction Interval Upper Limit 0.872753
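The limits in these tables can be traced back to the regression equation. The sketch below, written in Python as an illustration and not taken from the PHStat workbook, computes the predicted value from the five-input coefficients and the given values, and then adds and subtracts the two half widths reported above. The half widths themselves are taken as given and are not recomputed here.

```python
# Intercept and coefficients from the five-input regression output.
intercept = 0.497323086
coefficients = {
    "Gold_x_AAPL_acc1": 0.18629918,
    "Silver_vel10_x_Baltic_dry_acc2": 0.377818335,
    "Oil_vel5_x_AAPL_vel7": 0.108244958,
    "Baltic_dry_vel5_x_AAPL_vel11": -0.256465655,
    "5year_vel4_x_SP500_vel12": -0.361791706,
}

# Given values: the 11/11/2013 row of the original data.
given_values = {
    "Gold_x_AAPL_acc1": 0.027378,
    "Silver_vel10_x_Baltic_dry_acc2": 0.064445,
    "Oil_vel5_x_AAPL_vel7": 0.235156,
    "Baltic_dry_vel5_x_AAPL_vel11": 0.253993,
    "5year_vel4_x_SP500_vel12": 0.301476,
}

# Predicted Y (YHat) = intercept + sum of coefficient * given value.
y_hat = intercept + sum(
    coefficients[name] * given_values[name] for name in coefficients
)

# Half widths as reported by PHStat at the 95% level.
ci_half_width = 0.05934    # for the average predicted Y
pi_half_width = 0.494738   # for an individual response Y

confidence_interval = (y_hat - ci_half_width, y_hat + ci_half_width)
prediction_interval = (y_hat - pi_half_width, y_hat + pi_half_width)

print(f"YHat = {y_hat:.6f}")
print("Confidence interval:", confidence_interval)
print("Prediction interval:", prediction_interval)
```

The predicted value comes out at about 0.378, which is the midpoint of both reported intervals.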

As can be seen from the table above, the interval half width for the individual response Y is 0.49, almost 0.5, so the 95% prediction interval runs from -0.117 to 0.873. The prediction can serve predictors and investors as a reference, but an interval this wide does not by itself encourage investment, and there are still risks that may affect the share price. The prediction may also not be accurate enough because of a lack of related samples to analyse. In addition, the share market has plenty of uncertainties; therefore, users of this report cannot rely on data analysis alone and should think twice before investing on the basis of it.

 

The Durbin-Watson test

Durbin-Watson Calculations
Sum of Squared Difference of Residuals 24.72287316
Sum of Squared Residuals 11.96892238
Durbin-Watson Statistic 2.065588896

 

The Durbin-Watson test is applied to examine the presence of serial correlation in the residuals. The value of the Durbin-Watson statistic ranges from 0 to 4, and the acceptable range is from 1.50 to 2.50 (Evans, 2013). According to the table above, our Durbin-Watson statistic is 2.065588896, so the outcome is acceptable. If the Durbin-Watson statistic were not in the acceptable range, we would add a caution to the findings for a violation of regression assumptions.
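The statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. A minimal Python sketch follows; the function names are illustrative assumptions, the short residual list is made up purely to show the calculation, and the two sums are the ones reported in the table above.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over sum of squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2
        for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def in_acceptable_range(dw, low=1.5, high=2.5):
    """Acceptable range used in this report (Evans, 2013)."""
    return low <= dw <= high


# Reproduce the reported statistic from the two reported sums.
sum_squared_differences = 24.72287316
sum_squared_residuals = 11.96892238
dw_reported = sum_squared_differences / sum_squared_residuals

print(f"Durbin-Watson statistic: {dw_reported:.6f}")
print("Within acceptable range:", in_acceptable_range(dw_reported))

# Made-up residuals, only to show how the function is called.
toy_residuals = [0.1, -0.2, 0.15, -0.05]
print("Toy example:", durbin_watson(toy_residuals))
```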

 

 

Conclusion

To sum up, according to the regression analysis, the 5 independent variables do not have much influence on the future change. Originally, we expected gold and silver to matter, both as important raw materials for products and as financial instruments that help the company balance revenue and make up deficits from international exchange rates. However, these two materials are not as relevant to the share price as expected. This may be because neither raw material is a key element of the products, and because Apple’s finance department may focus on investments other than the gold and silver markets to reduce risk. Oil and the Baltic Dry Index offer another point of view for forecasting Apple’s future change. Unfortunately, these two factors have no strong relationship with Apple’s share price either, which might be due to shipping and investment strategies designed to avoid risk. What is more, we also used the SP500 to compare with Apple’s business performance and to predict the future change in share price. However, the figures show that Apple’s share price has no strong relevance to the SP500 as well. As a result, Apple’s business performance and the SP500 are decoupled. In other words, Apple’s business operates more independently.

References

 

Evans, J. R. (2013). Statistics, Data Analysis and Decision Modeling (5th ed.). Harlow, England: Pearson Education Limited.

Yahoo Finance. (2014). Apple Profile. Retrieved January 04, from http://finance.yahoo.com/q/pr?s=aapl

 
