Experiment Design for Data Science

Submitted by webmaster on Mon, 01/15/2018 - 16:33
Course No: 188992
Course Type: VU
Term: 2017W
Weekly Hours: 2.0
Lecturer: Peter Knees, Allan Hanbury, Alexander Schindler
Language: English
Objective: 

This course gives an introduction to data science. The emphasis is on strategies for the design of experiments, considering both workflow paradigms and the reproducibility and traceability of solutions. It also covers the lifecycle of data, from acquisition through processing and analysis to long-term provision and reuse. Students are also introduced to the complex legal and ethical aspects of working with data.
 
 

Content: 

The following topics are covered in the lectures:

  • Introduction to Data Science
  • Data and the data lifecycle
  • Conceptual experiment design
  • Workflow paradigms
  • Data management, reproducibility and traceability
  • Experiment error analysis and statistical testing
  • Advanced experiment design

In addition, students complete two exercises.
 
The effort breakdown is:
  • 7 two-hour lectures, including one multiple-choice quiz: 14 h
  • Exercise 1: 15 h
  • Exercise 2: 25 h
  • Exam preparation: 20 h
  • Exam: 1 h
  • Total: 75 h
 
 

Information: 

Syllabus
(All sessions in Seminarraum von Neumann, Wed, 16:00-18:00)

BLOCK 1
  • 18 Oct: Introduction to data science: data science process, algorithmic ethics, human-in-the-loop - Hanbury
  • 25 Oct: Data and the data lifecycle (including an introduction to ethical and legal aspects) - Hanbury

BLOCK 2
  • 8 Nov: Conceptual experiment design: planning and execution of experiments, CRISP-DM - Knees
  • 22 Nov: Workflow paradigms and scientific workflow environments: Taverna, Kepler, Myexperiments.org; environment set-up: IPython, IPython Notebook versioning, YesWorkflow, noWorkflow - Schindler, Knees

Exercise 1: Design an experimental workflow for a given dataset (start: 22 Nov, hand-in: 12 Dec)

BLOCK 3
  • 29 Nov: Facilitating reproducibility and traceability; basics of data management planning and data stewardship - Rauber
  • 6 Dec: Experiment error analysis and statistical testing - Knees
  • 13 Dec: Deep experiment design (statistical power, application in workflows, metastudies, ...) - Knees

Exercise 2: Reproduce experimental results from a paper (start: 29 Nov, interim submission: 5 Dec, hand-in: 19 Jan)

24 Jan: Exam

Notes: 
Examination: 

<ul>
<li>Ex1: 1..100 points. Minimum 35.</li>
<li>Ex2: 1..100 points. Minimum 35.</li>
<li>Exam: 1..100 points. Minimum 35.</li>
<li>Final Grade = 0.20*Ex1 + 0.35*Ex2 + 0.45*Exam. Minimum 50.</li>
</ul>
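
As a rough illustration of the grading rule above, the weighted score can be computed as in the following Python sketch. The function and constant names are hypothetical, and the sketch assumes that every listed minimum (35 points per component, 50 points overall) must be met in order to pass; the listing does not spell this out, so treat that reading as an assumption.

```python
# Weights and minimums as stated in the examination rules.
WEIGHTS = {"ex1": 0.20, "ex2": 0.35, "exam": 0.45}
COMPONENT_MINIMUM = 35  # per-component minimum (Ex1, Ex2, Exam)
FINAL_MINIMUM = 50      # minimum for the weighted final score


def final_grade(ex1, ex2, exam):
    """Return (weighted score, passed) for the three component scores.

    Assumption: passing requires every component minimum AND the
    final minimum to be met. This helper is illustrative only.
    """
    scores = {"ex1": ex1, "ex2": ex2, "exam": exam}
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    passed = (
        all(score >= COMPONENT_MINIMUM for score in scores.values())
        and total >= FINAL_MINIMUM
    )
    return round(total, 2), passed


if __name__ == "__main__":
    score, passed = final_grade(ex1=80, ex2=60, exam=50)
    print(f"weighted score: {score}, passed: {passed}")
```

Under this reading, a high weighted score does not compensate for a component below 35 points: for example, 100 points on both exercises with 30 points on the exam gives a weighted score above 50 but still counts as not passed.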

Recommendation: