Assignment Part 2 (25%) Marking Criteria (Rubric)

– Total Raw Marks: 100

For Task 1 (20 marks) and Task 2 (20 marks), the following marking criteria are applied to each task:

Criteria

Exemplary (10-9)

Good (8-7)

Satisfactory (6-5)

Limited (4-3)

Very Limited (2-0)

Formulate Hypothesis

10% (2 marks)

Both hypotheses (null and alternative) are correctly formed so that the purpose of the analysis is stated clearly and reasonably.

Exhibits aspects of exemplary (left) and satisfactory (right)

The hypotheses are formed but are not fully clear or reasonable.

Exhibits aspects of satisfactory (left) and very limited (right)

Hypotheses are irrelevant or meaningless.

Data Selection and Management (Data Preprocessing)

20% (4 marks)

Select appropriate and valid variables from the dataset and apply appropriate data management techniques so that the data connect well to the hypotheses you set and the dataset is fully ready for further analysis. Tasks required for this may include:

– Determine categorical or quantitative variables appropriately

– Determine explanatory and response variables appropriately

– Apply appropriate subsampling

– Apply appropriate operations to convert a variable from its original data type to a different type (e.g. quantitative to categorical) if needed.

– Apply appropriate operations for recoding labels or handling missing or invalid data

Appropriate/valid variables are selected and data management techniques are applied, but the work is not fully satisfactory or omits some necessary operations.

Applied limited or no data management techniques to the dataset provided
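
As an illustration only (not part of the marking criteria), the data management tasks listed in this row could be sketched in pandas as follows. The toy data, the column names and the codes 99 and 9 for missing/unknown values are assumptions made for this example; the assignment dataset will differ.

```python
import pandas as pd

# Hypothetical toy data: column names and the codes 99 / 9 for
# "missing/unknown" are assumptions made for this illustration.
raw = pd.DataFrame({
    "age": [19, 23, 31, 45, 52, 99, 28],
    "smoker": [1, 2, 1, 2, 9, 1, 2],          # 1 = yes, 2 = no, 9 = unknown
    "income": [21.0, 35.5, 40.2, 58.0, 61.3, 30.0, None],
})

df = raw.copy()

# Handle invalid data: turn the placeholder code 99 into a missing value.
df["age"] = df["age"].mask(df["age"] == 99)

# Recode labels: codes not in the mapping (here 9) become missing values.
df["smoker"] = df["smoker"].map({1: "yes", 2: "no"})

# Subsample: keep respondents aged 18 to 50, then drop incomplete rows.
clean = df[(df["age"] >= 18) & (df["age"] <= 50)].dropna().copy()

# Quantitative -> categorical: bin age into two groups.
clean["age_group"] = pd.cut(clean["age"], bins=[17, 29, 50],
                            labels=["18-29", "30-50"])

print(clean)
```

Which operations are actually needed depends on the hypotheses and on the codebook of the dataset in use.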

Inference Testing

40% (8 marks)

Use appropriate techniques to apply the required testing.

– Correct use of Python commands and arguments to apply the inference testing and to achieve the necessary results.

– All testing results (required to assess the evidence) are achieved and summarized fully and correctly.

– Necessary charts/plots (to support the testing) are correctly achieved

– Correct post hoc test was completed (if required)

Techniques are applied to generate relevant analysis results, but the work is not fully satisfactory and contains some errors. Minor omissions or errors in the main testing process.

The testing/analysis techniques are applied wrongly or poorly, OR an essential step of the testing process is missing.
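
As an illustration only, one possible shape of an inference test followed by a post hoc step is sketched below: a one-way ANOVA with SciPy, then pairwise t-tests with a Bonferroni-adjusted threshold. The group data are invented, and the choice of ANOVA and of Bonferroni (rather than, say, a chi-square test or Tukey's HSD) is an assumption; the appropriate test depends on the variable types in your hypotheses.

```python
from itertools import combinations
from scipy import stats

# Hypothetical measurements for three groups (invented for this sketch).
groups = {
    "a": [20, 21, 19, 22, 20],
    "b": [25, 27, 26, 24, 26],
    "c": [20, 22, 21, 19, 21],
}

# One-way ANOVA: do the group means differ at all?
f_stat, p_value = stats.f_oneway(*groups.values())
print(f"F = {f_stat:.2f}, p = {p_value:.4g}")

# Post hoc step, only if the overall test is significant:
# pairwise t-tests judged against a Bonferroni-adjusted alpha.
alpha = 0.05
significant_pairs = []
if p_value < alpha:
    pairs = list(combinations(groups, 2))
    adjusted_alpha = alpha / len(pairs)
    for g1, g2 in pairs:
        _, p_pair = stats.ttest_ind(groups[g1], groups[g2])
        if p_pair < adjusted_alpha:
            significant_pairs.append((g1, g2))

print("Pairs that differ:", significant_pairs)
```

A post hoc step is only meaningful when the overall test rejects the null hypothesis and the explanatory variable has more than two levels.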

Draw Conclusions

20% (4 marks)

Provide an appropriate and logical interpretation of the testing/analysis results to draw useful and correct conclusions about whether the evidence supports the hypotheses.

An interpretation/conclusion is given but is not fully correct.

Limited or no interpretation of the testing results

Notebook Presentation

10% (2 marks)

All content in the Jupyter Notebook is readable and easy to follow, with appropriate section titles (using Markdown sections) and useful inline comments.

Contents included in the Notebook are arranged properly but not ideally.

Notebook contents are poorly arranged or not readable.

For Task 3 (20 marks), the following marking criteria are applied:

Criteria

Exemplary (10-9)

Good (8-7)

Satisfactory (6-5)

Limited (4-3)

Very Limited (2-0)

Data Selection and Management (Data Preprocessing)

20% (4 marks)

Select appropriate and valid variables from the dataset and apply appropriate data management techniques so that the dataset is fully ready for further analysis. Tasks required for this may include:

– Determine quantitative variables appropriately

– Determine explanatory and response variables appropriately

– Apply appropriate subsampling

– Apply appropriate operations to convert a variable from its original data type to a different type (e.g. quantitative to categorical) if needed.

– Apply appropriate operations for recoding labels or handling missing or invalid data

Exhibits aspects of exemplary (left) and satisfactory (right)

Appropriate/valid variables are selected and data management techniques are applied, but the work is not fully satisfactory or omits some necessary operations.

Exhibits aspects of satisfactory (left) and very limited (right)

Applied limited or no data management techniques to the dataset provided

Regression Analysis

45% (9 marks)

Use appropriate techniques to complete the required analysis

– Correct use of Python commands and arguments to generate the scatter chart, the regression analysis result, and the residual plot.

– Derive the regression equation correctly from the regression result

Techniques are applied to generate relevant analysis results, but the work is not fully satisfactory and contains some errors.

The testing / analysis techniques are applied wrongly or poorly.
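
As an illustration only, a simple linear regression and its equation could be obtained along these lines. The data are invented and `scipy.stats.linregress` is just one option; the unit may expect a different library (for example statsmodels), so treat this as a sketch under those assumptions.

```python
import numpy as np
from scipy import stats

# Hypothetical explanatory (x) and response (y) values.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Fit y = intercept + slope * x by ordinary least squares.
result = stats.linregress(x, y)
print(f"Regression equation: y = {result.intercept:.2f} + {result.slope:.2f} * x")
print(f"r^2 = {result.rvalue ** 2:.3f}, p = {result.pvalue:.4g}")

# Fitted values and residuals, ready for a scatter chart and a residual plot.
fitted = result.intercept + result.slope * x
residuals = y - fitted
```

The `x`, `y`, `fitted` and `residuals` arrays can then be passed to a plotting library such as matplotlib to draw the scatter chart with the fitted line and the residual plot.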

Draw Conclusions

25% (5 marks)

Provide appropriate and logical interpretation of the testing/analysis results to elicit useful/correct conclusions

An interpretation/conclusion is given but is not fully correct.

Limited or no interpretation of the testing results

Notebook Presentation

10% (2 marks)

All content in the Jupyter Notebook is readable and easy to follow, with appropriate section titles (using Markdown sections) and useful inline comments.

Contents included in the Notebook are arranged properly but not ideally.

Notebook contents are poorly arranged or not readable.

For Task 4 (40 marks), the following marking criteria are applied:

Criteria

Exemplary (10-9)

Good (8-7)

Satisfactory (6-5)

Limited (4-3)

Very Limited (2-0)

Data Selection and Management (Data Preprocessing)

10% (4 marks)

– Determine quantitative variables appropriately

– Determine explanatory and response variables appropriately

– Apply appropriate subsampling

– Apply appropriate operations to convert a variable from its original data type to a different type (e.g. quantitative to categorical) if needed.

– Apply appropriate operations for recoding labels or handling missing or invalid data

Appropriate/valid variables are selected and data management techniques are applied, but the work is not fully satisfactory or omits some necessary operations.

Applied limited or no data management techniques to the dataset provided

Scatter Plots

10% (4 marks)

All scatter plots are correctly generated by applying appropriate Python functions, AND all correlation values are generated appropriately.

Exhibits aspects of exemplary (left) and satisfactory (right)

Generation of scatter plots and r-values is attempted but not fully correct.

Exhibits aspects of satisfactory (left) and very limited (right)

No attempt or mostly wrong
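
As an illustration only, the r-value for each explanatory variable against the response could be computed as sketched below. The variables `x1`, `x2` and `y` are invented for this example, and Pearson's r via `scipy.stats.pearsonr` is assumed to be the intended correlation measure.

```python
import numpy as np
from scipy import stats

# Hypothetical explanatory variables and a response built from them.
x1 = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
x2 = np.array([5, 3, 6, 2, 7, 1, 8, 4], dtype=float)
y = 3 * x1 + 0.5 * x2

# Pearson correlation of each explanatory variable with the response.
r_values = {}
for name, x in {"x1": x1, "x2": x2}.items():
    r, p = stats.pearsonr(x, y)
    r_values[name] = r
    print(f"{name} vs y: r = {r:.3f}, p = {p:.4g}")
```

Each (x, y) pair used here is also what a scatter plot of that explanatory variable against the response would display.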

Multiple regression strategy

15% (6 marks)

Apply a systematic, logical and reasonable strategy, with justification, to generate and test three or more candidate multiple regression models (different combinations of explanatory variables) and arrive at the final regression model.

A strategy is applied to generate candidate multiple regression models, but it is limited and not fully systematic or reasonable. A justification is given but is not clear or reasonable.

No strategy is applied, or the multiple regression model is composed arbitrarily.
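
As an illustration only, one systematic way to generate and compare candidate models is to fit every combination of explanatory variables and rank them by adjusted R-squared. The data, the variable names and the choice of adjusted R-squared as the comparison criterion are assumptions for this sketch; other strategies (for example adding variables in order of correlation strength and checking p-values) are equally defensible if justified.

```python
from itertools import combinations
import numpy as np

# Hypothetical explanatory variables; y depends on x1 and x2 only.
predictors = {
    "x1": np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float),
    "x2": np.array([5, 3, 6, 2, 7, 1, 8, 4], dtype=float),
    "x3": np.array([2, 2, 1, 3, 1, 3, 2, 2], dtype=float),
}
noise = np.array([0.1, -0.2, 0.15, -0.05, 0.05, -0.1, 0.2, -0.15])
y = 2 * predictors["x1"] + predictors["x2"] + noise


def adjusted_r2(names):
    """Fit y on the named predictors (plus intercept) by least squares."""
    X = np.column_stack([np.ones(len(y))] + [predictors[n] for n in names])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ coef) ** 2)          # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)          # total sum of squares
    n, p = len(y), len(names)
    return 1 - (rss / (n - p - 1)) / (tss / (n - 1))


# Candidate models: every combination of one, two or three predictors.
candidates = [c for k in (1, 2, 3) for c in combinations(predictors, k)]
scores = {c: adjusted_r2(c) for c in candidates}

for combo, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(" + ".join(combo), f"adj. R^2 = {score:.4f}")
```

Adjusted R-squared penalises extra variables, so it gives a like-for-like comparison between models of different sizes; the written justification for the final choice still has to be made in the notebook.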

Regression Analysis Results

15% (6 marks)

Use appropriate techniques to generate all candidate regression models following the strategy you set up.

– Correct use of Python commands and arguments to generate the regression analysis results and the corresponding regression equations

Techniques are applied to generate relevant analysis results, but the work is not fully satisfactory and contains some errors.

The testing / analysis techniques are applied wrongly or poorly.

Q-Q plots

15% (6 marks)

Use appropriate techniques to generate all Q-Q plots corresponding to each candidate regression model, AND the conclusion is drawn appropriately, with logical justification, from the comparison of all Q-Q plots generated.

Q-Q plots are generated but are incomplete or not fully correct. The justification given for the conclusion is not fully reasonable or correct.

No attempt, or most Q-Q plots or conclusions generated are wrong or poor.
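
As an illustration only, the points behind a normal Q-Q plot of a model's residuals can be computed with `scipy.stats.probplot`. The residuals below are simulated stand-ins (an assumption); in the notebook they would come from each candidate regression model.

```python
import numpy as np
from scipy import stats

# Simulated stand-in for one model's residuals (assumption for this sketch).
rng = np.random.default_rng(0)
residuals = rng.normal(loc=0.0, scale=1.0, size=50)

# Theoretical normal quantiles vs ordered residuals, plus the fitted line.
(theoretical_q, ordered_resid), (slope, intercept, r) = stats.probplot(
    residuals, dist="norm"
)
print(f"Q-Q line: slope = {slope:.3f}, intercept = {intercept:.3f}, r = {r:.3f}")
```

Plotting `ordered_resid` against `theoretical_q` gives the Q-Q plot; points lying close to the fitted line suggest approximately normal residuals, which is the basis for comparing candidate models.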

Residual Plots

15% (6 marks)

All standardized residual plots and relevant result values are correctly generated (for each candidate regression model), AND the conclusion is drawn appropriately from the results.

Corresponding residual plots and conclusions are produced but are incomplete or not fully correct.

No attempt, or most plots/results generated are wrong or meaningless.
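
As an illustration only, standardized residuals and the share of observations falling outside common cut-offs could be computed as sketched below. The data are invented, the standardization is the simple one (residual divided by the residual standard error, ignoring leverage), and the cut-offs of 2 and 2.5 are a rule of thumb assumed here rather than something this rubric specifies.

```python
import numpy as np

# Hypothetical explanatory (x) and response (y) values.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.3, 3.8, 6.4, 7.7, 10.2, 11.9, 14.3, 15.8])

# Fit a straight line and compute the raw residuals.
slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (intercept + slope * x)

# Simple standardization: divide by the residual standard error.
n, p = len(y), 1                      # p = number of explanatory variables
resid_std = np.sqrt(np.sum(residuals ** 2) / (n - p - 1))
standardized = residuals / resid_std

# Share of observations beyond the assumed cut-offs.
share_beyond_2 = np.mean(np.abs(standardized) > 2)
share_beyond_2_5 = np.mean(np.abs(standardized) > 2.5)
print(f"|z| > 2: {share_beyond_2:.1%}, |z| > 2.5: {share_beyond_2_5:.1%}")
```

Plotting `standardized` against the observation index (or the fitted values) gives the standardized residual plot; a large share of points beyond the cut-offs would point to a poor fit or to outliers.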

Overall Conclusion

10% (4 marks)

Provide appropriate and logical interpretation of the overall analysis results to elicit useful/correct conclusions

An interpretation/conclusion is given but is not fully correct.

Limited or no interpretation of the testing results

Notebook Presentation

10% (4 marks)

All content in the Jupyter Notebook is readable and easy to follow, with appropriate section titles (using Markdown sections) and useful inline comments.

Contents included in the Notebook are arranged properly but not fully ideal.

Notebook contents are poorly arranged or not readable.
