This assignment requires a considerable amount of computer work and written comment. You may need to seek guidance from your tutor along the way. Do not leave the Assignment until too late. Each question carefully describes what you are required to do, so please follow the instructions carefully. Your answer to each question should begin with the number of the question. Refer to the General Marking Guide below for details.
In this assignment you will examine data on employees of a medium sized company. The variables which have been recorded for each employee are listed below, and the data is contained in the file Salary.xls.
Id employee identity number
Gender gender of staff member
Start yr year employee started work with the company
Dept department 1:warehousing; 2:import/export ; 3:production 4.management
Begin $ initial salary
2011 $ salary in 2011
Position position in company.1.management 2Clerical3.production
The company is about to enter into negotiations over the salaries of production workers and management has asked for a report on their current pay.
Random Sample: Before you begin your analysis you are required to take a random sample of size 100. Use the file Random Sampler Generator.xls to do this. Your tutor will show you how this can be done in EXCEL. Your answers to the assignment tasks below are to be based on your sample of 100 cases. Make sure to keep a safe copy of your sample since you cannot use Random Sampler Generator.xls to reproduce the first sample. Provide a printout of the data in your sample, with ID numbers in ascending order. When sorting your sample remember to highlight all of your sample data before sorting. Copy your sample file to another spreadsheet for working your assignment. Make sure you save your random sample.
1. Determine the mean, standard deviation and standard error for the production workers in your sample. Use this information to carry out a t-test to see if the average pay for production workers is less than $49,000. (5 marks)
2. Construct a new variable called Experience which is equal to the length of time each employee has worked for the company up to the year 2011. State the formula used for the new variable. Obtain a scatter-plot of Current Pay vs Experience for Production staff. Think carefully about which variable should go on the vertical axis – remember, it is the independent variable that goes on the horizontal axis (i.e. the x-axis). Make sure you label your axes properly and that your graph has an appropriate title. Briefly describe the nature of the relationship between these two variables. (8 marks)
3. Use XL to carry out a regression analysis on these two variables. (This is for the production staff only.) Copy the output into your assignment and use it to respond to the following:
o Write down the regression equation.(2 marks) o State the R-Square value and the Standard Error and explain what they mean with
respect to this data. (2 marks)
o Write down the value of the gradient of the regression line and explain what it means in this case. (2 marks)
o Is the constant or intercept value significant in this case? How do you know this? (2 marks)o Briefly explain why you think this regression model is, or is not, a good model. (2 marks)
4. For production staff, obtain a stratified scatterplot distinguishing males and females. Make sure that the scatterplot is labeled appropriately and that the markers for males and females can be easily distinguished and are an appropriate size. Your scatterplot should include trendlines, equations and R2 values for each group. (2 marks)
Provide a brief comparison of the pay for the two groups, noting important differences, if any, between them. (3 marks) Do males receive better pay than females? Do males receive better pay increases than females? (3 marks)
5. Using information from your analyses write a short concluding paragraph about the pay of production staff and, in particular, the pay for male and female production staff. (2 marks)
Bonus mark: If overall flow and structure of discussion, technical language, grammar and spelling is sound (up to 5 marks).
General Assessment Criteria:
First class level student work will display an ability to apply conceptual understanding of statistics and data analysis techniques to practical situations. In this case, such work will show evidence of student’s ability to:
Identify the problem of interest and its practical importance.
Source appropriate data for analysis.
Use appropriate tools to extract and analyse the data.
Determine the best method of data analysis and identify limitations of such method.
Explain results with clarity and confidence. Other levels of work: The further the student’s work deviates from the generalized ideal described above, the lower their resulting mark is likely to be.