## STATS1900 Business Statistics Major Assignment

This assignment requires a considerable amount of computer work and written comment. You may need to seek guidance from your tutor along the way. Do not leave the Assignment until too late. Each question carefully describes what you are required to do, so please follow the instructions carefully. Your answer to each question should begin with the number of the question. Refer to the General Marking Guide below for details.

In this assignment you will examine data on employees of a medium sized company. The variables which have been recorded for each employee are listed below, and the data is contained in the file Salary.xls.

Variable              description

Id                         employee identity number

Gender                gender of staff member

Start yr                year employee started work with the company

Dept                     department 1:warehousing; 2:import/export ; 3:production 4.management

Begin \$                  initial salary

2011 \$                     salary in 2011

Position                  position in company.1.management 2Clerical3.production

The company is about to enter into negotiations over the salaries of production workers and management has asked for a report on their current pay.

2 marks

1.                Determine the mean, standard deviation and standard error for the production workers in your sample. Use this information to carry out a t-test to see if the average pay for production workers is less than \$49,000.  (5 marks)

2.                Construct a new variable called Experience which is equal to the length of time each employee has worked for the company up to the year 2011. State the formula used for the new variable.  Obtain a scatter-plot of Current Pay vs Experience for Production staff. Think carefully about which variable should go on the vertical axis – remember, it is the independent variable that goes on the horizontal axis (i.e. the x-axis). Make sure you label your axes properly and that your graph has an appropriate title.  Briefly describe the nature of the relationship between these two variables.  (8 marks)

3.                Use XL to carry out a regression analysis on these two variables. (This is for the production staff only.) Copy the output into your assignment and use it to respond to the following:

o Write down the regression equation.(2 marks) o State the R-Square value and the Standard Error and explain what they mean with

respect to this data. (2 marks)

o Write down the value of the gradient of the regression line and explain what it means in this case. (2 marks)

o Is the constant or intercept value significant in this case? How do you know this? (2 marks)o Briefly explain why you think this regression model is, or is not, a good model. (2 marks)

4.                For production staff, obtain a stratified scatterplot distinguishing males and females. Make sure that the scatterplot is labeled appropriately and that the markers for males and females can be easily distinguished and are an appropriate size.  Your scatterplot should include trendlines, equations and R2 values for each group. (2 marks)

Provide a brief comparison of the pay for the two groups, noting important differences, if any, between them. (3 marks) Do males receive better pay than females? Do males receive better pay increases than females?  (3 marks)

5.                Using information from your analyses write a short concluding paragraph about the pay of production staff and, in particular, the pay for male and female production staff.  (2 marks)

Bonus mark: If overall flow and structure of discussion, technical language, grammar and spelling is sound (up to 5 marks).

General Assessment Criteria:

First class level student work will display an ability to apply conceptual understanding of statistics and data analysis techniques to practical situations. In this case, such work will show evidence of student’s ability to:

  Identify the problem of interest and its practical importance.

  Source appropriate data for analysis.

  Use appropriate tools to extract and analyse the data.

  Determine the best method of data analysis and identify limitations of such method.

  Explain results with clarity and confidence.  Other levels of work: The further the student’s work deviates from the generalized ideal described above, the lower their resulting mark is likely to be.