BIOSTAT823
Statistical Program for Big Data
Biostat & Bioinformatic Dept
AHCG - Allied Health Graduate
Subject
BIOSTAT
Catalog Number
823
Title
Statistical Program for Big Data
Course Description
This course describes the challenges that the increasing importance of large data sets poses for analysts, and the strategies that have been developed in response. The core topics are how to manage data and how to make computation scalable. The data management module covers guidelines for working with open data, as well as the concepts and practical skills needed to work with in-memory, relational, and NoSQL databases. The scalable computing module focuses on asynchronous, concurrent, parallel, and distributed computing, and on the construction of effective workflows following DevOps practices. Applications to the analysis of structured, semi-structured, and unstructured data, especially from biomedical contexts, are interleaved throughout the course. Course examples are primarily in Python, and fluency in Python is assumed. Credits: 3
Grading Basis
ABCDF Grading
Consent (Permission Number)
No Special Consent Required
Min Units
3
Max Units
3
Lecture