SIT731 - Data Wrangling
Unit details
| Year: | 2022 unit information |
|---|---|
| Important Update: | Unit delivery will be in line with the most current COVIDSafe health guidelines. We continue to tailor learning experiences for each unit to achieve the best possible mix of online and on-campus activities that successfully blend our approaches to learning, working and research. Please check your unit sites for announcements and updates. Last updated: 4 March 2022 |
| Enrolment modes: | Trimester 1: Burwood (Melbourne), Online; Trimester 3: Burwood (Melbourne), Online |
| Credit point(s): | 1 |
| EFTSL value: | 0.125 |
| Unit Chair: | Trimester 1: Marek Gagolewski; Trimester 3: Marek Gagolewski |
| Prerequisite: | SIT774. |
| Corequisite: | Nil |
| Incompatible with: | SIT220 |
| Typical study commitment: | Students will on average spend 150 hours over the trimester undertaking the teaching, learning and assessment activities for this unit. |
| Scheduled learning activities - campus: | 1 x 3-hour active class per week |
| Scheduled learning activities - cloud: | Online independent and collaborative learning including optional scheduled activities as detailed in the unit site. |
Content
Data Science (DS) and Artificial Intelligence (AI) are popular fields for making sense of data collected in large quantities from various sources. Accurate exploration and modelling with DS and AI rely heavily on appropriately prepared data. Data wrangling is the process of preparing raw data for modelling. The aim of this unit is to learn various data wrangling methodologies and the programming techniques needed to perform them. This includes programming in Python to perform data wrangling tasks; extracting data from different sources; working with different types of data, and storing and retrieving them; applying sampling techniques and inspecting the data; cleaning data by identifying outliers and anomalies and handling missing data; transforming, selecting and extracting features; performing exploratory analysis; visualising data using various tools; summarising data appropriately; and performing basic statistical analysis and modelling using basic machine learning. Techniques for maintaining data privacy and exercising ethics in data manipulation are also covered.
| ULO | These are the Learning Outcomes (ULO) for this unit. At the completion of this unit, successful students can: | Deakin Graduate Learning Outcomes |
|---|---|---|
| ULO1 | Undertake data wrangling tasks by using appropriate programming and scripting languages to extract, clean, consolidate, and store data of different data types from a range of data sources | GLO1: Discipline-specific knowledge and capabilities |
| ULO2 | Research data discovery and extraction methods and tools and apply resulting learning to handle extracting data based on project needs. | GLO3: Digital literacy |
| ULO3 | Design, implement, and explain to both technical and non-technical audiences the data model needed to achieve project goals and the processes that can be used to convert data from data sources | GLO1: Discipline-specific knowledge and capabilities |
| ULO4 | Use both statistical and machine learning techniques to perform exploratory analysis on data extracted, and communicate results to technical and non-technical audiences | GLO1: Discipline-specific knowledge and capabilities |
| ULO5 | Apply and reflect on techniques for maintaining data privacy and exercising ethics in data handling. | GLO8: Global citizenship |
These Unit Learning Outcomes are applicable for all teaching periods throughout the year.
Assessment
Trimester 1
| Assessment Description | Student output | Grading and weighting (% total mark for unit) | Indicative due week |
|---|---|---|---|
| Learning Portfolio | Portfolio consists of a number of artefacts including scripts, business reports, presentations along with critique and reflections. | 80% | Week 12 |
| Examination | 2-hour written examination | 20% | Examination period |
Trimester 3
| Assessment Description | Student output | Grading and weighting (% total mark for unit) | Indicative due week |
|---|---|---|---|
| Learning Portfolio | Portfolio consists of a number of artefacts including scripts, business reports, presentations along with critique and reflections. | 80% | Week 12 |
| Online Quiz | 2-hour online quiz | 20% | Week 11 |
The assessment due weeks provided may change. The Unit Chair will clarify the exact assessment requirements, including the due date, at the start of the teaching period.
Hurdle requirement
Trimester 1: To be eligible to obtain a pass in this unit, students must meet certain milestones as part of the portfolio, and must achieve a mark of at least 50% in the examination.
Trimester 3: To be eligible to obtain a pass in this unit, students must meet certain milestones as part of the portfolio, and must achieve a mark of at least 50% in the online quiz.
Learning Resource
There is no prescribed text. Unit materials are provided via the unit site. This includes unit topic readings and references to further information.
The texts and reading list for the unit can be found on the University Library site via the link below: SIT731
Note: Select the relevant trimester reading list. A future teaching period's reading list may not be available until a month before that teaching period starts, so you may wish to use the relevant trimester's prior-year reading list as a guide only.
Unit Fee Information
Click on the fee link below which describes you: