This lecture covers the basic programming approaches in data science. The emphasis is on computational thinking: formulating problems and their solution spaces so that a computer can solve them. Methods for increasing the efficiency of the solutions are also presented. Use cases demonstrate the practical application of data science solutions.
The following topics are covered in the lectures:
- Introduction to Data-Oriented Programming Paradigms
- Python
- SciPy, NumPy, vectorisation, execution performance measurement
- Data preparation, structuring, fusion
- Data Science solution approaches and case studies
- Introduction to network analysis
In addition, students complete three exercises.
The effort breakdown is:
- Python tutorial: 4h
- Lectures: 7 sessions @ 2h: 14h
- Exercises:
  - EX1 (OO vs. DO): 5h
  - EX2 (pandas + sklearn): 10h
  - EX3 (project): 42h [includes review meeting (topic + questions + work plan)]
- SUM: 75h
Syllabus
All lectures take place on Tuesdays, 11:00-13:00, in Seminarraum Gödel, Favoritenstraße 9.
- Kickoff-Session, data science process, community, solution examples [Hanbury] (9.10)
- Introduction to DOPP, text stream processing [Böck] (16.10)
- Python tutorial [Böck] (23.10)
- SciPy, NumPy, vectorisation, visualisation, benchmarking [Böck] (30.10)
- Preprocessing, Pandas [Kiesling] (6.11)
- Intro to Machine Learning/sklearn [Hanbury] (13.11)
- Network Analysis [Hanbury] (27.11)
Exercise-related sessions
Review meetings for exercise 3. 18.12.2018, 14:00-18:00 (15 minutes for each group)
Project presentation. 22.1.2019 in Seminarraum Gödel, 11:00-15:00
Ex1, Ex2: 1..100 points each. Minimum 35.
Grade = 0.25*Ex1 + 0.75*Ex2. Minimum 50.
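The weighting above can be checked with a few lines of Python. This is only a sketch under two assumptions that the text does not spell out: that the minimum of 35 applies to Ex1 and Ex2 individually, and that the minimum of 50 is the pass mark for the weighted score. The names `weighted_score` and `passes` are hypothetical and chosen for illustration.

```python
# Assumption: the stated minimum of 35 applies to each of Ex1 and Ex2 separately.
EX_MIN = 35
# Assumption: the stated minimum of 50 is the pass mark for the weighted score.
TOTAL_MIN = 50


def weighted_score(ex1, ex2):
    """Weighted score as given in the text: 0.25*Ex1 + 0.75*Ex2."""
    return 0.25 * ex1 + 0.75 * ex2


def passes(ex1, ex2):
    """True if both per-exercise minimums and the overall minimum are met."""
    return (
        ex1 >= EX_MIN
        and ex2 >= EX_MIN
        and weighted_score(ex1, ex2) >= TOTAL_MIN
    )


if __name__ == "__main__":
    # Example: 40 points on Ex1 and 80 points on Ex2.
    print(weighted_score(40, 80), passes(40, 80))
```

Under these assumptions, a high Ex2 score cannot compensate for an Ex1 score below 35, because the per-exercise minimum is checked before the weighted score.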