Module title :- Principles of Data Science
Level :- 7
Assessment title :- Statistical Analysis and Interactive Dashboard Design
Weighting within module :- This assessment is worth 100% of the overall module mark.
How to submit :- Submit your assignment through Blackboard as two separate files: first, a single PDF report, and second, a zipped file containing your code and dashboard.
Please check that:
1. Your report is named with your name (your name.pdf).
2. The zip file is valid and can be opened.
3. The zip file contains clearly labelled material, including fully working versions of your R code and dashboard, with a clearly written description of each application and its use in a Read Me.txt file. Share your dashboard as a .twb file if you have used Tableau, or as a .pbix file if you have used Power BI.

Statistical Analysis and Interactive Dashboard Design

Assessment task details and instructions :-
Your task is to demonstrate your newly developed knowledge and understanding of data handling, validation, statistical analysis and visualisation by exploring and presenting data from an extensive and complex data set.

There are two sources of data for this assignment:
1. World Development Indicators (WDI)
2. UNdata

World Development Indicators (WDI) is the World Bank's main collection of indicators, and UNdata is the United Nations data bank; both are compiled from officially recognised international sources.

Both sources include national, regional and global estimates, and both cover numerous indicators for countries around the world. They also provide yearly slices of data for each country, which can be treated as time series.

The data set for this assignment can be accessed from one of these two sources of your choice:

Once you have followed the above links, you can download the data set by selecting the countries and variables you want to work with.

This assessment requires a comprehensive statistical analysis and a working dashboard prototype to be presented.
Your statistical analysis and interactive dashboard design should be fully justified and explained in your report.

The assessment also requires you to demonstrate the justified use of techniques for data preparation, validation, analysis and/or modelling and prediction, appropriately referencing research into dashboard composition, layout, function and form. Justification of the approaches taken for statistical analysis and visualisation is expected, and outputs should be provided. Your reasoned thinking, research and critical evaluation of both the problem resolution and your solution also form a substantive part of this work.

Task 1: Interactive Dashboard Design
For this first task, imagine you are working as a Data Scientist at a non-governmental organisation involved in social and economic development globally.

Part of your role is to use data to communicate these issues to a wider public. As your first assignment at the organisation, they have asked you to select some indicators, using the data sources above, which you believe tell a significant story, and to produce a single-screen interactive dashboard to present this data. For example, it could compare the trade situation of the least developed countries with that of developed countries. Your dashboard is to be made publicly available on their website, so you should consider how you can present the data to a general audience who may not have existing expertise in the subject you choose.

The requirements for the proposed dashboard are:
1. Clearly define the objectives of the dashboard based on the dataset you have selected.
2. Based on the objectives, select at least 10 suitable countries of your choice.
3. Produce a single-screen interactive dashboard covering the data of at least 10 countries.
4. Clear, effective presentation of all factors in a coherent, intuitively comprehensible form, reflecting the objectives you have set for your dashboard.
5. A design applicable to the full range of countries presented in the data set without modification to the dashboard form or structure (i.e., the dashboard should support a side-by-side comparison of multiple countries and/or financial years).

Alongside the dashboard design, you should provide a full report which summarises:
• The objectives you have defined for your dashboard indicating clearly what your
planned solution will communicate to your audience
• The data visualisation principles which have informed your dashboard design with
reference to literature and best practice in data visualisation
• The steps you have taken to pre-process and prepare the data
• An overview of your design with a full justification of the design rationale

For extra credit you should also implement the following advanced features in your
dashboard design:

• Use of DAX (if using Power BI) or Calculations (if using Tableau)
• Use of relationships in your data model
• Use of hierarchies, grouping or binning
• Use of in-built Power BI / Tableau forecasting tools

To receive extra credit, these features must be fully documented in your accompanying report.

Task 2: Statistical Analysis
The requirements for the proposed statistical analysis are:
1. Define research objectives based on the data set, for instance, to compare the trade situation of the least developed countries with that of developed countries.
2. Based on the objectives, select at least 10 suitable countries of your choice.
3. Choose a set of indicators according to the objectives with at least 10 years of data.
4. Complete the following tasks. Present and interpret your findings and results in the report as fully as you can, and show the R analytics steps.
4.1. Do a comprehensive descriptive statistical analysis (e.g., mean, median, mode, standard deviation, skewness and kurtosis) on the data.

4.2. Do a correlation analysis for the indicators and evaluate the results in the context of your stated objectives.

4.3. Do regression analysis. Explain why the selected regression techniques are appropriate for the selected variables and defined objectives and show if you’ve found any similar research in the literature.

4.4. Do time series analysis. Explain why the selected techniques are better for the defined objectives and show if you’ve found any similar research in the literature.

4.5. As a researcher, carry out a comparative analysis of the main hypothesis testing approaches relevant to your objectives, explaining when and why each is used. Then define at least two hypotheses related to the objectives and test them.

5. In general, describe the steps that you have taken for data preparation, outlier detection, dealing with missing data and data privacy protection.
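
As an illustration of the kind of R analytics step requested in item 4, the descriptive statistics listed in 4.1 could be computed along the following lines. This is a minimal sketch under stated assumptions, not a required or model solution: the vector `gdp_growth` and its values are hypothetical (not taken from WDI or UNdata), the helper names `stat_mode`, `skewness` and `kurtosis` are introduced here for illustration, and the moment-based definitions shown are only one of several conventions in use.

```r
# Hypothetical example data: one indicator for one country over ten years.
# The values are invented for illustration, not taken from WDI or UNdata.
gdp_growth <- c(2.1, 3.4, 1.8, 2.9, 3.4, -0.5, 4.2, 3.4, 2.6, 1.9)

# Base R has no built-in statistical mode, so count the values and
# return the most frequent one(s).
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[counts == max(counts)])
}

# Moment-based skewness: third central moment divided by the
# second central moment raised to the power 3/2.
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Moment-based (non-excess) kurtosis: fourth central moment divided
# by the squared second central moment.
kurtosis <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2
}

# Collect the statistics named in 4.1 into one named vector.
# Assumes a single mode; several tied modes would add extra entries.
summary_stats <- c(
  mean     = mean(gdp_growth),
  median   = median(gdp_growth),
  mode     = stat_mode(gdp_growth),
  sd       = sd(gdp_growth),       # sample standard deviation (n - 1)
  skewness = skewness(gdp_growth),
  kurtosis = kurtosis(gdp_growth)
)
print(round(summary_stats, 3))
```

The sketch assumes a numeric vector with no missing values. Real indicator data usually contains gaps, so how you handle them (for example with `na.rm = TRUE` or by imputation) needs to be decided and justified as part of item 5.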

1. You can use similar datasets and objectives for both tasks, although, if you prefer, you can select different objectives and datasets for each task.
2. You must use the R programming language for the entire statistical analysis part (Task 2).
3. You can use Tableau or Power BI to develop the dashboard.
4. You can combine variables from different data sets to make your own data set, provided it is in a meaningful and correct format.

Assessed intended learning outcomes:
On successful completion of this assessment you will be able to:

A- Knowledge and Understanding

1. Analyse a data science project to devise a structure for its implementation, analysis and evaluation, justifying any decisions made.
2. Critically assess the relative strengths and uses of a range of statistical analysis techniques (including t-tests, ANOVA, various regression models, categorical data analysis, tests of hypothesis and time series analysis).
3. Present and visualise the statistical results, analysing key findings.
4. Evaluate the quality of graphs according to their expressiveness and effectiveness.

B- Practical, Professional or Subject-Specific Skills
1. Understand the history and context of data science, and the ethics, skills, challenges and methodologies the term implies.
2. Learn how to work with a real-world data set that may lie outside your domain expertise and of which you have no prior knowledge or understanding.
3. Develop skills in presenting quantitative data using appropriate displays, tabulations and summaries.
4. Understand the nature of sampling variation and the role of statistical methods in developing and testing hypotheses.
5. Select and use appropriate statistical methods in the analysis of complex data sets.
6. Present findings based on statistical analysis in a clear, concise and understandable manner.
7. Select the proper visualisation methods for a given data analysis and presentation.

C- Transferable Skills and Other Attributes
5. Technical report writing.
6. Ability to use tools and techniques for statistical analysis.
7. Presenting data in a manner accessible to non-technical stakeholders.
8. Data science ethics, information governance, information literacy and data protection.

Module Aims
The module is focused on the underpinning knowledge and practical skills needed for working within the data sciences industry.

Word count/ duration (if applicable)
Your assessment should be between 6000 and 8000 words (between 30 and 40 pages).

Feedback arrangements
You can expect to receive individual feedback in the form of an annotated marking matrix, with specific comments for each section, general comments for the work and up to 3 specific areas for improvement.

Support arrangements
You can obtain support for this assessment by contacting Dr Kaveh Kiani or Nathan Topping for the technical aspects of the module. Further support can be obtained from the university as follows:

Assessment Criteria
You should consult the assessment criteria to see exactly what we will be looking for when marking the assessment.

In Year Retrieval Scheme
Your assessment is not eligible for in-year retrieval. If you are eligible for this scheme, you will be contacted shortly after the feedback deadline.

If you fail your assessment and are eligible for reassessment, you will be allowed to redo the assignment based on the feedback given. The submission for this will be based on the university’s reassessment calendar and routines.

