BIOSTAT823

Statistical Program for Big Data

Biostat & Bioinformatic Dept AHCG - Allied Health Graduate

Subject

BIOSTAT

Catalog Number

823

Title

Statistical Program for Big Data

Course Description

This course describes the challenges that the growing importance of large data sets poses for analysts, and the strategies that have been developed in response. The core topics are how to manage data and how to make computation scalable. The data management module covers guidelines for working with open data, along with the concepts and practical skills needed to work with in-memory, relational, and NoSQL databases. The scalable computing module focuses on asynchronous, concurrent, parallel, and distributed computing, as well as the construction of effective workflows following DevOps practices. Applications to the analysis of structured, semi-structured, and unstructured data, especially from biomedical contexts, are interleaved throughout the course. Course examples are primarily in Python, and fluency in Python is assumed. Credits: 3

Grading Basis

ABCDF Grading

Consent (Permission Number)

No Special Consent Required

Min Units

3

Max Units

3

Lecture