Assignment Part 2 (25%) Marking Criteria (Rubric)
– Total Raw Marks: 100
For Task 1 (20 marks) and Task 2 (20 marks), the following marking criteria are applied to each task:
Criteria |
Exemplary (10-9) |
Good (8-7) |
Satisfactory (6-5) |
Limited (4-3) |
Very Limited (2-0) |
Formulate Hypothesis 10% (2 marks) |
Both hypotheses (Null and Alternative) are correctly formed so that the purpose of the analysis can be implied clearly and reasonably. |
Exhibits aspects of exemplary (left) and satisfactory (right) |
The hypothesis is formed but not fully clear or reasonable |
Exhibits aspects of satisfactory (left) and very limited (right) |
Hypotheses are irrelevant or meaningless. |
Data Selection and Management (Data Preprocessing) 20% (4 marks) |
Select appropriate and valid variables from the dataset and apply appropriate data management techniques so that the data connect well to the hypotheses you set and the dataset is fully ready for further analysis. Tasks required for this may include: – Determine categorical and quantitative variables appropriately – Determine explanatory and response variables appropriately – Apply appropriate subsampling – Apply appropriate operations to convert a variable from its original data type to a different type (e.g. quantitative to categorical) if needed – Apply appropriate operations for recoding labels or handling missing or invalid data |
Appropriate and valid variables are selected and appropriate data management techniques are applied, but the result is not fully satisfactory or some necessary operations are missing. |
Limited or no data management techniques are applied to the dataset provided. |
||
Inference Testing 40% (8 marks) |
Use appropriate techniques to carry out the required test. – Correct use of Python commands and arguments to run the inference test and obtain the necessary results. – All test results (required to assess the evidence) are obtained and summarized fully and correctly. – Necessary charts/plots (to support the test) are produced correctly. – The correct post hoc test is completed (if required). |
Techniques are applied and produce relevant analysis results, but the results are not fully satisfactory and some are incorrect. Minor omissions or errors in the main testing process. |
The testing/analysis techniques are applied wrongly or poorly, OR an essential step of the testing process is missing. |
||
Draw Conclusions 20% (4 marks) |
Provide an appropriate and logical interpretation of the testing/analysis results to draw useful and correct conclusions about whether the hypotheses are supported. |
An interpretation/conclusion is given but is not fully correct. |
Limited or no interpretation of the testing results |
||
Notebook Presentation 10% (2 marks) |
All contents of the Jupyter Notebook are easy to read and understand, with appropriate section titles (using Markdown sections) and useful inline comments. |
Contents of the Notebook are arranged properly but not ideally. |
Notebook contents are poorly arranged or not readable. |
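As an illustration of what the Inference Testing criterion looks for, here is a minimal sketch of a one-way ANOVA on made-up data. The three groups and their values are hypothetical and are not taken from the assignment dataset. To stay within the Python standard library, the p-value is estimated with a permutation test instead of the F distribution; in a notebook, a library routine such as `scipy.stats.f_oneway` would normally be used.

```python
import random
from statistics import fmean


def f_statistic(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = fmean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_p_value(groups, n_permutations=2000, seed=0):
    """Estimate the p-value by shuffling the group labels (an assumption of
    this sketch; it replaces the usual F-distribution lookup)."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_permutations + 1)


# Hypothetical response values for three levels of a categorical variable.
groups = [
    [4.1, 3.8, 4.5, 4.0, 4.3],
    [5.2, 5.6, 4.9, 5.4, 5.1],
    [4.2, 4.4, 3.9, 4.6, 4.1],
]

# H0: all group means are equal; H1: at least one group mean differs.
f_value, p_value = permutation_p_value(groups)
print(f"F = {f_value:.2f}, permutation p-value = {p_value:.4f}")
if p_value < 0.05:
    print("Reject H0; a post hoc test is needed to find which groups differ.")
else:
    print("Fail to reject H0.")
```

The sketch states both hypotheses, reports the test statistic and p-value, and notes when a post hoc test would be required. These are the elements listed under Formulate Hypothesis, Inference Testing and Draw Conclusions.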
For Task 3 (20 marks), the following marking criteria are applied:
Criteria |
Exemplary (10-9) |
Good (8-7) |
Satisfactory (6-5) |
Limited (4-3) |
Very Limited (2-0) |
Data Selection and Management (Data Preprocessing) 20% (4 marks) |
Select appropriate and valid variables from the dataset and apply appropriate data management techniques to make the dataset fully ready for further analysis. Tasks required for this may include: – Determine quantitative variables appropriately – Determine explanatory and response variables appropriately – Apply appropriate subsampling – Apply appropriate operations to convert a variable from its original data type to a different type (e.g. quantitative to categorical) if needed – Apply appropriate operations for recoding labels or handling missing or invalid data |
Exhibits aspects of exemplary (left) and satisfactory (right) |
Appropriate and valid variables are selected and appropriate data management techniques are applied, but the result is not fully satisfactory or some necessary operations are missing. |
Exhibits aspects of satisfactory (left) and very limited (right) |
Limited or no data management techniques are applied to the dataset provided. |
Regression Analysis 45% (9 marks) |
Use appropriate techniques to complete the required analysis. – Correct use of Python commands and arguments to generate the scatter plot, the regression analysis result, and the residual plot. – Derive the regression equation correctly from the regression result. |
Techniques are applied and produce relevant analysis results, but the results are not fully satisfactory and some are incorrect. |
The testing/analysis techniques are applied wrongly or poorly. |
||
Draw Conclusions 25% (5 marks) |
Provide an appropriate and logical interpretation of the testing/analysis results to draw useful and correct conclusions. |
An interpretation/conclusion is given but is not fully correct. |
Limited or no interpretation of the testing results |
||
Notebook Presentation 10% (2 marks) |
All contents of the Jupyter Notebook are easy to read and understand, with appropriate section titles (using Markdown sections) and useful inline comments. |
Contents of the Notebook are arranged properly but not ideally. |
Notebook contents are poorly arranged or not readable. |
For Task 4 (40 marks), the following marking criteria are applied:
Criteria |
Exemplary (10-9) |
Good (8-7) |
Satisfactory (6-5) |
Limited (4-3) |
Very Limited (2-0) |
Data Selection and Management (Data Preprocessing) 10% (4 marks) |
Select appropriate and valid variables from the dataset and apply appropriate data management techniques to make the dataset fully ready for further analysis. Tasks required for this may include: – Determine quantitative variables appropriately – Determine explanatory and response variables appropriately – Apply appropriate subsampling – Apply appropriate operations to convert a variable from its original data type to a different type (e.g. quantitative to categorical) if needed – Apply appropriate operations for recoding labels or handling missing or invalid data |
Appropriate and valid variables are selected and appropriate data management techniques are applied, but the result is not fully satisfactory or some necessary operations are missing. |
Limited or no data management techniques are applied to the dataset provided. |
||
Scatter Plots 10% (4 marks) |
All scatter plots are correctly generated by applying appropriate Python functions, AND all correlation values are computed correctly. |
Exhibits aspects of exemplary (left) and satisfactory (right) |
Scatter plots and r-values are attempted but are not fully correct. |
Exhibits aspects of satisfactory (left) and very limited (right) |
No attempt or mostly wrong |
Multiple regression strategy 15% (6 marks) |
Apply a systematic, logical and reasonable strategy, with justification, to generate and test various (three or more) combinations of explanatory variables as candidate multiple regression models from which the final regression model is composed. |
A strategy is applied to generate candidate multiple regression models, but it is limited and not fully systematic or reasonable. A justification is given but is not clear or reasonable. |
No strategy is applied, or the multiple regression model is composed arbitrarily. |
||
Regression Analysis Results 15% (6 marks) |
Use appropriate techniques to generate all candidate regression models following the strategy you set up. – Correct use of Python commands and arguments to generate the regression analysis results and the corresponding regression equations. |
Techniques are applied and produce relevant analysis results, but the results are not fully satisfactory and some are incorrect. |
The testing/analysis techniques are applied wrongly or poorly. |
||
Q-Q plots 15% (6 marks) |
Use appropriate techniques to generate the Q-Q plot for each candidate regression model, AND draw an appropriate, logically justified conclusion from the comparison of all Q-Q plots generated. |
Q-Q plots are generated but are incomplete or not fully correct. The justification given for the conclusion is not fully reasonable or correct. |
No attempt, or most Q-Q plots or conclusions are wrong or poor. |
Residual Plots 15% (6 marks) |
All standardized residual plots and relevant result values are correctly generated (for each candidate regression model), AND an appropriate conclusion is drawn from the results. |
Corresponding residual plots and conclusions are given but are incomplete or not fully correct. |
No attempt, or most plots/results generated are wrong or meaningless. |
||
Overall Conclusion 10% (4 marks) |
Provide an appropriate and logical interpretation of the overall analysis results to draw useful and correct conclusions. |
An interpretation/conclusion is given but is not fully correct. |
Limited or no interpretation of the testing results |
||
Notebook Presentation 10% (4 marks) |
All contents of the Jupyter Notebook are easy to read and understand, with appropriate section titles (using Markdown sections) and useful inline comments. |
Contents of the Notebook are arranged properly but not ideally. |
Notebook contents are poorly arranged or not readable. |
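The Q-Q plot and residual plot criteria both start from a model's residuals. The sketch below shows, under stated assumptions, the numbers those two plots are built from. The `residuals` list is hypothetical and stands in for the residuals of one candidate model. "Standardized" here means simple z-scores (residual minus mean, divided by the sample standard deviation), which is a simplification of the leverage-adjusted values a regression library may report. The plotting positions `(i + 0.5) / n` are one common convention and are also an assumption. In a notebook the plots themselves would usually come from a library routine such as `statsmodels.api.qqplot`.

```python
from statistics import NormalDist, fmean, stdev


def standardize(residuals):
    """Scale residuals to mean 0 and standard deviation 1 (simple z-scores)."""
    centre, spread = fmean(residuals), stdev(residuals)
    return [(r - centre) / spread for r in residuals]


def qq_points(z_scores):
    """Pair each sorted z-score with its theoretical standard normal quantile."""
    n = len(z_scores)
    normal = NormalDist()
    theoretical = [normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, sorted(z_scores)))


# Hypothetical residuals from one candidate regression model.
residuals = [-1.2, 0.4, 0.9, -0.3, 1.6, -0.8, 0.1, -0.5, 0.7, -0.9]

z_scores = standardize(residuals)
points = qq_points(z_scores)

# Share of observations more than 2 standard deviations from zero;
# roughly 5% would be expected if the residuals were normally distributed.
share_beyond_2 = sum(abs(z) > 2 for z in z_scores) / len(z_scores)

for theoretical, observed in points:
    print(f"theoretical = {theoretical:+.3f}, observed = {observed:+.3f}")
print(f"Share of |standardized residual| > 2: {share_beyond_2:.0%}")
```

Points that lie close to the line `observed = theoretical` suggest approximately normal residuals. Repeating the calculation for each candidate model gives the comparison on which the conclusion is based.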