## Data Analysis and Decision Modelling

Assignment Requirements

So basically I have the Excel chart and an example assignment from my friend. What I need you to do is take the data from the Excel chart and apply it to my friend’s assignment. Don’t copy it: use your own words and make it a new piece of work, with my friend’s assignment as a reference. Probably add more definitions as well.

To sum up: all the numbers and data you need come from the Excel chart, and for writing style I want you to follow my friend’s assignment.

I only have 12 hours, so please let me know now if there is anything you don’t understand, because it’s 1am and I’ll be asleep in 2 hours (3am).

I will upload all the documents for you.

Introduction

The purpose of this essay is to illustrate the relationships and interactions between the variable factors in a regression model, and then to predict future changes in the share price of one company, Apple, in the share market. The report consists of collecting data, identifying the variables and building a multiple regression model, and then analysing the resulting information and graphs, such as the P-value, VIF and F test.

Original Data:

| Date | Gold_x_AAPL_acc1 | Silver_vel10_x_Baltic_dry_acc2 | Oil_vel5_x_AAPL_vel7 | Baltic_dry_vel5_x_AAPL_vel11 | 5year_vel4_x_SP500_vel12 | Future_change |
| --- | --- | --- | --- | --- | --- | --- |
| 12/11/2013 | 0.021007 | 0.039063 | 0.182188 | 0.103333 | 0.291806 | 0.2625 |
| 11/11/2013 | 0.027378 | 0.064445 | 0.235156 | 0.253993 | 0.301476 | 0.316667 |
| 8/11/2013 | 0.08724 | 0.015625 | 0.146979 | 0.234896 | 0.372431 | 0.395833 |
| 7/11/2013 | 0.034236 | 0.012708 | 0.110677 | 0.360972 | 0.045833 | 0.658333 |
| 6/11/2013 | 0.076563 | 0.197777 | 0.095 | 0.475694 | 0.331459 | 0.320833 |
| 5/11/2013 | 0.054861 | 0.235989 | 0.011458 | 0.375 | 0.335313 | 0.325 |
| 4/11/2013 | 0.125417 | 0.393021 | 0.014236 | 0.155694 | 0.473611 | 0.254167 |
| 1/11/2013 | 0.079132 | 0.5875 | 0.038889 | 0.055087 | 0.516076 | 0.541667 |
| 31/10/2013 | 0.076823 | 0.60783 | 0.201562 | 0.040399 | 0.539688 | 0.554167 |
| 30/10/2013 | 0.209792 | 0.092083 | 0.196875 | 0.014844 | 0.437708 | 0.445833 |
| 29/10/2013 | 0.068177 | 0.395416 | 0.382673 | 0.022083 | 0.455381 | 0.745833 |
| … | … | … | … | … | … | … |
| … | … | … | … | … | … | … |
| 11/02/2013 | 0.416666 | 0.129167 | 0.444618 | 0.455381 | 0.119401 | 0.408333 |
| 8/02/2013 | 0.628334 | 0.551597 | 0.189496 | 0.367362 | 0.189844 | 0.6 |
| 7/02/2013 | 0.847084 | 0.477969 | 0.148541 | 0.015625 | 0.053437 | 0.916667 |
| 6/02/2013 | 0.320833 | 0.364653 | 0.183768 | 0.012413 | 0.13125 | 0.929167 |
| 5/02/2013 | 0.824913 | 0.322917 | 0.193958 | 0.004375 | 0.305278 | 0.3 |
| 4/02/2013 | 0.075347 | 0.362465 | 0.232916 | 0.005208 | 0.195972 | 0.641667 |
| 1/02/2013 | 0.496875 | 0.44625 | 0.008594 | 0.006094 | 0.385486 | 0.475 |
| 31/01/2013 | 0.419966 | 0.30586 | 0.019201 | 0.022656 | 0.407057 | 0.820833 |

The data used in this essay were collected from Yahoo Finance (Yahoo Finance, 2013). Yahoo Finance contains a wide range of statistics, and its data are open to use. This report therefore picked five independent variables from the Yahoo Finance database to analyse the fluctuation of the share price and predict its future changes.

Identification of Variables

Y variable

Variable: future change

X variable

Variable 1: Gold_x_AAPL_acc1

This variable shows the relationship between gold and Apple.

Variable 2: Silver_vel10_x_Baltic_dry_acc2

This variable reflects how raw materials and the shipping of goods affect Apple’s business operations.

Variable 3: Oil_vel5_x_AAPL_vel7

This variable points to the relevance of oil prices, which feed into the cost of shipping goods worldwide.

Variable 4: Baltic_dry_vel5_x_AAPL_vel11

This variable reflects the cost of shipping goods worldwide, such as delivering finished goods from factories to markets.

Variable 5: 5year_vel4_x_SP500_vel12

This variable compares the short-term share price with the SP500 figure to show how Apple is performing in the market.

Check inputs for collinearity

Collinearity is the undesirable situation where the correlations among the independent variables are strong. Collinearity misleadingly inflates the standard errors (Evans, 2013). As a result, some variables may be found not to be significantly different from 0 when they should otherwise be significant.

First of all, we check whether any of the inputs are collinear by using the PHStat Multiple Regression function to compute the VIF. The VIF, or variance inflation factor, is a figure used to detect collinearity in a linear model. When an independent variable has a collinearity problem, part of what it explains is also explained by the other independent variables. Moreover, the researcher then cannot tell how much impact each variable has on Y. The smaller the VIF, the less of a collinearity problem there is. If the VIF for one of the variables is around or greater than 5, there is collinearity associated with that variable, and one of these variables must be removed from the regression model (Evans, 2013). In this case, the VIFs of all five independent variables are less than 5; therefore, there is no collinearity problem in this report.

The outputs are shown below:

| Regression Analysis: input and all other X | Multiple R | R Square | Adjusted R Square | Standard Error | Observations | VIF |
| --- | --- | --- | --- | --- | --- | --- |
| 5year_vel4_x_SP500_vel12 | 0.2601 | 0.0677 | 0.0484 | 0.2053 | 199 | 1.0726 |
| Baltic_dry_vel5_x_AAPL_vel11 | 0.4759 | 0.2265 | 0.2105 | 0.1770 | 199 | 1.2928 |
| Silver_vel10_x_Baltic_dry_acc2 | 0.3629 | 0.1317 | 0.1138 | 0.2020 | 199 | 1.1517 |
| Oil_vel5_x_AAPL_vel7 | 0.5462 | 0.2984 | 0.2839 | 0.1985 | 199 | 1.4253 |
| Gold_x_AAPL_acc1 | 0.1711 | 0.0293 | 0.0093 | 0.1758 | 199 | 1.0301 |

The output table shows that all the VIF values are less than 5.
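Each VIF above follows from the R Square of regressing one input on all the other inputs, since VIF = 1 / (1 - R Square). The short Python sketch below recomputes the VIFs from the R Square values reported in the outputs. The function name `vif` and the dictionary layout are illustrative assumptions, not part of PHStat, and the results agree with the reported figures only up to rounding of R Square.

```python
def vif(r_squared):
    """Variance inflation factor for one input, given the R Square obtained
    by regressing that input on all the other inputs."""
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must be in [0, 1)")
    return 1.0 / (1.0 - r_squared)

# R Square values copied from the regression outputs reported in this section.
auxiliary_r_squared = {
    "5year_vel4_x_SP500_vel12": 0.0677,
    "Baltic_dry_vel5_x_AAPL_vel11": 0.2265,
    "Silver_vel10_x_Baltic_dry_acc2": 0.1317,
    "Oil_vel5_x_AAPL_vel7": 0.2984,
    "Gold_x_AAPL_acc1": 0.0293,
}

vifs = {name: vif(r2) for name, r2 in auxiliary_r_squared.items()}

# Rule of thumb used in this report: a VIF around or above 5 signals collinearity.
flagged = [name for name, value in vifs.items() if value >= 5]

for name, value in vifs.items():
    print(f"{name}: VIF = {value:.4f}")
print("Inputs flagged for collinearity:", flagged)
```

With these inputs the flagged list is empty, which matches the conclusion that no input needs to be removed for collinearity.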

Check residuals are normally distributed

Next, we check whether the residuals are normally distributed, which is a regression assumption separate from collinearity. Using the Excel Data Analysis function, we can compute the following graphs.

The last graph is a straight line (or almost straight), which indicates that the residuals are normally distributed, so our hypothesis tests can be relied on.
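The "straight line" check can also be expressed numerically. The sketch below assumes the graph in question is a normal probability plot: it pairs the ordered residuals with standard normal quantiles and measures how close the points are to a straight line with a correlation coefficient. The residual values are hypothetical and for illustration only, because the report does not list its residuals; the function names are likewise assumptions.

```python
from statistics import NormalDist, mean

def normal_plot_points(residuals):
    """Return (theoretical normal quantile, ordered residual) pairs."""
    ordered = sorted(residuals)
    n = len(ordered)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n stay strictly inside (0, 1).
    quantiles = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))

def straightness(points):
    """Pearson correlation of the plotted points; values near 1 mean the
    plot is close to a straight line."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical residuals, NOT taken from the report.
example_residuals = [-0.31, -0.18, -0.12, -0.05, -0.01, 0.03, 0.08, 0.14, 0.20, 0.29]

points = normal_plot_points(example_residuals)
print(f"Straightness of the plot: {straightness(points):.3f}")
```

A value close to 1 corresponds to the "almost straight" pattern described above; the cut-off for "close enough" is a judgement call and is not specified in the report.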

Check which inputs are helping.

In this step, we check two important figures: the adjusted R square and the P-value.

R square

In the regression model, R square is the proportion of the variation in Y that is explained by the independent variables. As a result, R square can serve as a measure of how accurately X predicts Y. Its value is the ratio of the regression sum of squares (SSR) to the total sum of squares (SST). Furthermore, adjusted R square corrects R square for the number of independent variables in the model, so it does not rise simply because another input has been added.
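The two definitions can be written out as follows. This is the standard textbook form rather than a formula printed in the report; here n is the number of observations and k is the number of independent variables.

```latex
R^2 = \frac{SSR}{SST},
\qquad
R^2_{\mathrm{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - k - 1}
```

For the full model in this report (R Square 0.179658, n = 199, k = 5), this gives 1 - 0.820342 × 198 / 193 ≈ 0.1584, in line with the adjusted R square in the regression output.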

As can be seen from the regression analysis, the adjusted R square is 0.158405, which points to a weak relationship between X and Y: the model explains only about 15.8% of the variation in Y. Most of the variation in Y is therefore left to other elements that are not in the model. However, the relationship is still statistically significant.

| Regression Statistics | |
| --- | --- |
| Multiple R | 0.42386 |
| R Square | 0.179658 |
| Adjusted R Square | 0.158405 |
| Standard Error | 0.249028 |
| Observations | 199 |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Intercept | 0.497323086 | 0.041719076 | 11.92076 | 6.581E-25 | 0.415039232 | 0.579606941 | 0.415039232 | 0.579606941 |
| Gold_x_AAPL_acc1 | 0.18629918 | 0.101725034 | 1.8314 | 0.06858258 | -0.014336327 | 0.386934688 | -0.014336327 | 0.386934688 |
| Silver_vel10_x_Baltic_dry_acc2 | 0.377818335 | 0.088513082 | 4.268503 | 3.083E-05 | 0.203241179 | 0.552395491 | 0.203241179 | 0.552395491 |
| Oil_vel5_x_AAPL_vel7 | 0.108244958 | 0.090080908 | 1.201641 | 0.23097488 | -0.069424471 | 0.285914387 | -0.069424471 | 0.285914387 |
| Baltic_dry_vel5_x_AAPL_vel11 | -0.256465655 | 0.100999 | -2.539289 | 0.01189634 | -0.455669182 | -0.05726213 | -0.455669182 | -0.057262129 |
| 5year_vel4_x_SP500_vel12 | -0.361791706 | 0.087098442 | -4.153825 | 4.9072E-05 | -0.533578722 | -0.19000469 | -0.533578722 | -0.19000469 |

In the output sheet, the adjusted R square is 0.158405. Looking further down, we check the P-value of each input. If an input has a P-value greater than 0.05, we should consider deleting that whole column of inputs from the original data. The table above shows two inputs with a P-value greater than 0.05, Gold_x_AAPL_acc1 (0.0686) and Oil_vel5_x_AAPL_vel7 (0.2310), so we go back to the original data and delete those two columns to see how the adjusted R square changes.
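The screening rule described here can be sketched in a few lines of Python. The P-values are copied from the coefficients table of the five-input model; the 0.05 threshold is the one used in this report, and the variable names in the code are illustrative assumptions.

```python
# P-values copied from the coefficients table of the five-input model.
p_values = {
    "Gold_x_AAPL_acc1": 0.06858258,
    "Silver_vel10_x_Baltic_dry_acc2": 3.083e-05,
    "Oil_vel5_x_AAPL_vel7": 0.23097488,
    "Baltic_dry_vel5_x_AAPL_vel11": 0.01189634,
    "5year_vel4_x_SP500_vel12": 4.9072e-05,
}

ALPHA = 0.05  # significance threshold used in this report

# Inputs whose P-value exceeds the threshold, worst (highest P-value) first.
candidates = sorted(
    (name for name, p in p_values.items() if p > ALPHA),
    key=p_values.get,
    reverse=True,
)

for name in candidates:
    print(f"Consider removing {name} (P-value {p_values[name]:.4f})")
```

Sorting by P-value makes it easy to try removing the weakest input first and then re-check the adjusted R square, which is the order followed in the rest of this section.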

| Date | Silver_vel10_x_Baltic_dry_acc2 | Baltic_dry_vel5_x_AAPL_vel11 | 5year_vel4_x_SP500_vel12 | Future_change |
| --- | --- | --- | --- | --- |
| 12/11/2013 | 0.039063 | 0.103333 | 0.291806 | 0.2625 |
| 11/11/2013 | 0.064445 | 0.253993 | 0.301476 | 0.316667 |
| 8/11/2013 | 0.015625 | 0.234896 | 0.372431 | 0.395833 |
| 7/11/2013 | 0.012708 | 0.360972 | 0.045833 | 0.658333 |
| 6/11/2013 | 0.197777 | 0.475694 | 0.331459 | 0.320833 |
| 5/11/2013 | 0.235989 | 0.375 | 0.335313 | 0.325 |
| 4/11/2013 | 0.393021 | 0.155694 | 0.473611 | 0.254167 |
| 1/11/2013 | 0.5875 | 0.055087 | 0.516076 | 0.541667 |
| 31/10/2013 | 0.60783 | 0.040399 | 0.539688 | 0.554167 |
| 30/10/2013 | 0.092083 | 0.014844 | 0.437708 | 0.445833 |
| 29/10/2013 | 0.395416 | 0.022083 | 0.455381 | 0.745833 |
| … | … | … | … | … |
| … | … | … | … | … |
| 11/02/2013 | 0.129167 | 0.455381 | 0.119401 | 0.408333 |
| 8/02/2013 | 0.551597 | 0.367362 | 0.189844 | 0.6 |
| 7/02/2013 | 0.477969 | 0.015625 | 0.053437 | 0.916667 |
| 6/02/2013 | 0.364653 | 0.012413 | 0.13125 | 0.929167 |
| 5/02/2013 | 0.322917 | 0.004375 | 0.305278 | 0.3 |
| 4/02/2013 | 0.362465 | 0.005208 | 0.195972 | 0.641667 |
| 1/02/2013 | 0.44625 | 0.006094 | 0.385486 | 0.475 |
| 31/01/2013 | 0.30586 | 0.022656 | 0.407057 | 0.820833 |

After deleting those two columns, the output tables are as shown below:

| Regression Statistics | |
| --- | --- |
| Multiple R | 0.399072 |
| R Square | 0.159258 |
| Adjusted R Square | 0.146324 |
| Standard Error | 0.250809 |
| Observations | 199 |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Intercept | 0.53142673 | 0.038261011 | 13.8895108 | 6.05374E-31 | 0.455968204 | 0.606885246 | 0.455968204 | 0.606885246 |
| Silver_vel10_x_Baltic_dry_acc2 | 0.42743504 | 0.085345172 | 5.008309548 | 1.22576E-06 | 0.259116945 | 0.595753133 | 0.259116945 | 0.595753133 |
| Baltic_dry_vel5_x_AAPL_vel11 | -0.206687 | 0.09149364 | -2.2590309 | 0.024988049 | -0.38713109 | -0.02624283 | -0.38713109 | -0.02624283 |
| 5year_vel4_x_SP500_vel12 | -0.3276067 | 0.085414441 | -3.83549547 | 0.000169117 | -0.49606141 | -0.159152 | -0.49606141 | -0.159152 |

After deleting the two columns, the adjusted R square falls from 0.158405 to 0.146324. Because it became smaller, we should un-delete the last column we deleted. Given this change in the adjusted R square, instead of deleting two columns from the original data, we delete only the column with the higher P-value, Oil_vel5_x_AAPL_vel7. The resulting table is shown below.

| Date | Gold_x_AAPL_acc1 | Silver_vel10_x_Baltic_dry_acc2 | Baltic_dry_vel5_x_AAPL_vel11 | 5year_vel4_x_SP500_vel12 | Future_change |
| --- | --- | --- | --- | --- | --- |
| 12/11/2013 | 0.021007 | 0.039063 | 0.103333 | 0.291806 | 0.2625 |
| 11/11/2013 | 0.027378 | 0.064445 | 0.253993 | 0.301476 | 0.316667 |
| 8/11/2013 | 0.08724 | 0.015625 | 0.234896 | 0.372431 | 0.395833 |
| 7/11/2013 | 0.034236 | 0.012708 | 0.360972 | 0.045833 | 0.658333 |
| 6/11/2013 | 0.076563 | 0.197777 | 0.475694 | 0.331459 | 0.320833 |
| 5/11/2013 | 0.054861 | 0.235989 | 0.375 | 0.335313 | 0.325 |
| 4/11/2013 | 0.125417 | 0.393021 | 0.155694 | 0.473611 | 0.254167 |
| 1/11/2013 | 0.079132 | 0.5875 | 0.055087 | 0.516076 | 0.541667 |
| 31/10/2013 | 0.076823 | 0.60783 | 0.040399 | 0.539688 | 0.554167 |
| 30/10/2013 | 0.209792 | 0.092083 | 0.014844 | 0.437708 | 0.445833 |
| 29/10/2013 | 0.068177 | 0.395416 | 0.022083 | 0.455381 | 0.745833 |
| … | … | … | … | … | … |
| … | … | … | … | … | … |
| 11/02/2013 | 0.416666 | 0.129167 | 0.455381 | 0.119401 | 0.408333 |
| 8/02/2013 | 0.628334 | 0.551597 | 0.367362 | 0.189844 | 0.6 |
| 7/02/2013 | 0.847084 | 0.477969 | 0.015625 | 0.053437 | 0.916667 |
| 6/02/2013 | 0.320833 | 0.364653 | 0.012413 | 0.13125 | 0.929167 |
| 5/02/2013 | 0.824913 | 0.322917 | 0.004375 | 0.305278 | 0.3 |
| 4/02/2013 | 0.075347 | 0.362465 | 0.005208 | 0.195972 | 0.641667 |
| 1/02/2013 | 0.496875 | 0.44625 | 0.006094 | 0.385486 | 0.475 |
| 31/01/2013 | 0.419966 | 0.30586 | 0.022656 | 0.407057 | 0.820833 |

| Regression Statistics | |
| --- | --- |
| Multiple R | 0.416557415 |
| R Square | 0.17352008 |
| Adjusted R Square | 0.156479257 |
| Standard Error | 0.249313166 |
| Observations | 199 |

Comparing the three adjusted R square values (0.158405 with all five inputs, 0.156479 with one column deleted and 0.146324 with two columns deleted), both reduced models give a smaller adjusted R square than the original one. Thus, we decide not to delete those columns.
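The comparison can be reproduced from the R Square values in the three regression outputs. The sketch below uses the standard adjusted R square formula, 1 - (1 - R Square)(n - 1)/(n - k - 1), with n = 199 observations; the model labels and function name are illustrative assumptions, and small differences from the Excel figures come from rounding of R Square.

```python
def adjusted_r_squared(r_squared, n_obs, n_inputs):
    """Adjusted R square for a model with n_inputs independent variables."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_inputs - 1)

N_OBS = 199  # observations reported in every regression output

# (R Square, number of inputs) copied from the three regression outputs.
models = {
    "all five inputs": (0.179658, 5),
    "Oil removed": (0.17352008, 4),
    "Gold and Oil removed": (0.159258, 3),
}

scores = {
    label: adjusted_r_squared(r2, N_OBS, k) for label, (r2, k) in models.items()
}

# Keep the model with the highest adjusted R square.
best = max(scores, key=scores.get)

for label, score in scores.items():
    print(f"{label}: adjusted R square = {score:.6f}")
print("Preferred model:", best)
```

The preferred model is the one with all five inputs, which agrees with the decision not to delete any columns.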

Make a prediction with a confidence interval

To predict the future change, we use PHStat’s Confidence Interval Estimate & Prediction function with a 95% confidence level. We use the row of original data shown below to fill in the table and predict the future change.

| Date | Gold_x_AAPL_acc1 | Silver_vel10_x_Baltic_dry_acc2 | Oil_vel5_x_AAPL_vel7 | Baltic_dry_vel5_x_AAPL_vel11 | 5year_vel4_x_SP500_vel12 | Future_change |
| --- | --- | --- | --- | --- | --- | --- |
| 11/11/2013 | 0.027378 | 0.064445 | 0.235156 | 0.253993 | 0.301476 | 0.316667 |

| Data | |
| --- | --- |
| Confidence Level | 95% |
| | 1 |
| Gold_x_AAPL_acc1 given value | 0.027378 |
| Silver_vel10_x_Baltic_dry_acc2 given value | 0.064445 |
| Oil_vel5_x_AAPL_vel7 given value | 0.235156 |
| Baltic_dry_vel5_x_AAPL_vel11 given value | 0.253993 |
| 5year_vel4_x_SP500_vel12 given value | 0.301476 |

| For Average Predicted Y (YHat) | |
| --- | --- |
| Interval Half Width | 0.05934 |
| Confidence Interval Lower Limit | 0.318675 |
| Confidence Interval Upper Limit | 0.437354 |

| For Individual Response Y | |
| --- | --- |
| Interval Half Width | 0.494738 |
| Prediction Interval Lower Limit | -0.11672 |
| Prediction Interval Upper Limit | 0.872753 |

As the table above shows, the interval half width for an individual response Y is 0.494738, almost 0.5, so the 95% prediction interval runs from -0.11672 to 0.872753. This interval is wide compared with the Future_change values in the original data, so the prediction gives predictors and investors only a rough reference rather than a precise forecast. The report therefore does not encourage investment on the basis of this prediction alone, because there are still risks that can affect the share price. The prediction may also not be accurate enough because there is a lack of related samples to analyse. In addition, the share market has plenty of uncertainties, so the report user cannot rely on data analysis alone and should think twice after reading this report and before investing.
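Both intervals are centred on the point prediction, which is the intercept plus each coefficient multiplied by its given value. The sketch below recomputes that prediction from the five-input coefficients and the 11/11/2013 inputs, then rebuilds the two intervals from the half widths reported by PHStat. The half widths are taken as given rather than derived, and the variable names are illustrative assumptions.

```python
# Coefficients of the five-input model, copied from the regression output.
intercept = 0.497323086
coefficients = {
    "Gold_x_AAPL_acc1": 0.18629918,
    "Silver_vel10_x_Baltic_dry_acc2": 0.377818335,
    "Oil_vel5_x_AAPL_vel7": 0.108244958,
    "Baltic_dry_vel5_x_AAPL_vel11": -0.256465655,
    "5year_vel4_x_SP500_vel12": -0.361791706,
}

# Given values for 11/11/2013, copied from the prediction table.
given = {
    "Gold_x_AAPL_acc1": 0.027378,
    "Silver_vel10_x_Baltic_dry_acc2": 0.064445,
    "Oil_vel5_x_AAPL_vel7": 0.235156,
    "Baltic_dry_vel5_x_AAPL_vel11": 0.253993,
    "5year_vel4_x_SP500_vel12": 0.301476,
}

# Point prediction: intercept plus the sum of coefficient * given value.
y_hat = intercept + sum(coefficients[name] * given[name] for name in coefficients)

# Half widths as reported by PHStat (not recomputed here).
CONFIDENCE_HALF_WIDTH = 0.05934
PREDICTION_HALF_WIDTH = 0.494738

confidence_interval = (y_hat - CONFIDENCE_HALF_WIDTH, y_hat + CONFIDENCE_HALF_WIDTH)
prediction_interval = (y_hat - PREDICTION_HALF_WIDTH, y_hat + PREDICTION_HALF_WIDTH)

print(f"Predicted Future_change: {y_hat:.6f}")
print(f"95% confidence interval: {confidence_interval}")
print(f"95% prediction interval: {prediction_interval}")
```

The observed Future_change for that date, 0.316667, lies inside the prediction interval, although the interval is wide enough that this is a weak check.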

The Durbin-Watson test

| Durbin-Watson Calculations | |
| --- | --- |
| Sum of Squared Difference of Residuals | 24.72287316 |
| Sum of Squared Residuals | 11.96892238 |
| Durbin-Watson Statistic | 2.065588896 |

The Durbin-Watson test is applied to examine the presence of serial correlation in the residuals. The value of the Durbin-Watson statistic ranges from 0 to 4, and the acceptable range is from 1.50 to 2.50 (Evans, 2013). As the table shows, our Durbin-Watson statistic is 2.065588896, so the outcome is acceptable. If the Durbin-Watson statistic were not in the acceptable range, we would add a caution to the findings for a violation of regression assumptions.
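The statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below shows that calculation for a list of residuals and, since the report does not list its residuals, checks the reported figure from the two sums in the table. The function names are illustrative assumptions; the 1.50 to 2.50 range is the one cited above.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals given in observation order."""
    squared_differences = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    squared_residuals = sum(e ** 2 for e in residuals)
    return squared_differences / squared_residuals

def is_acceptable(statistic, lower=1.50, upper=2.50):
    """Apply the acceptable range cited in this report."""
    return lower <= statistic <= upper

# The two sums reported in the Durbin-Watson table.
SUM_SQUARED_DIFFERENCES = 24.72287316
SUM_SQUARED_RESIDUALS = 11.96892238

reported_statistic = SUM_SQUARED_DIFFERENCES / SUM_SQUARED_RESIDUALS
print(f"Durbin-Watson statistic: {reported_statistic:.6f}")
print("Within acceptable range:", is_acceptable(reported_statistic))
```

A value near 2 indicates little serial correlation, while values near 0 or 4 indicate strong positive or negative serial correlation respectively.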

Conclusion

To sum up, according to the regression model analysis, the five independent variables do not have much influence on the future change. Originally, we expected gold and silver to matter because both are important raw materials for products and also play a role as financial instruments that help the company balance revenue and make up deficits from international exchange rates. However, these two materials are not as closely related to the share price as expected. This might be because neither raw material is a key element of the products, and because Apple’s finance department may focus on investments other than the gold and silver markets to reduce risk. Oil and the Baltic Dry Index offer another point of view for forecasting Apple’s future change. Unfortunately, these two factors have no strong relationship with Apple’s share price either, which might be due to different shipping and investment strategies that avoid risk. What is more, we also brought in the SP500 to compare with Apple’s business performance and predict the future change of the share price. However, the figures show that Apple’s share price has no strong relationship with the SP500 either. As a result, Apple’s business performance and the SP500 are decoupled. In other words, Apple’s business is operating more independently.

References

Evans, J. R. (2013). Statistics, Data Analysis and Decision Modeling (5th ed.). Harlow, England: Pearson Education Limited.

Yahoo Finance. (2014). Apple Profile. Retrieved January 04, from http://finance.yahoo.com/q/pr?s=aapl
