This lecture covers the basic programming approaches in Data Science. The emphasis is on computational thinking, that is, formulating problems and their solution spaces so that a computer can solve them. Methods for increasing the efficiency of the solutions are also presented. Use cases demonstrate the practical application of data science solutions.
The following topics are covered in the lectures:
- Introduction to Data-Oriented Programming Paradigms
- Python
- SciPy, NumPy, vectorisation, execution performance measurement
- Data preparation, structuring, fusion with Pandas
- Data Science solution approaches and case studies
- Introduction to machine learning
- Introduction to network analysis
The link to the online lectures is on TUWEL.
Syllabus
All lectures take place on Tuesdays, 12:00 c.t. to 13:45.
- Kickoff-Session, data science process, community, solution examples, Python introduction [Hanbury] (6.10.2020)
- Introduction to DOPP [Hanbury] (13.10.2020)
- SciPy, NumPy, vectorisation, visualisation, benchmarking [Piroi] (27.10.2020)
- Preprocessing, Pandas [Piroi] (3.11.2020)
- Intro to Machine Learning [Hanbury] (17.11.2020)
- Network Analysis [Hanbury] (1.12.2020)
Exercise-related sessions
Review meetings for exercise 3 (15 minutes for each group):
- 15.12.2020, 9:00-13:00
- 16.12.2020, 9:00-11:00 and 13:00-15:00
Project presentation: 27.1.2021, 9:00-16:00
The effort breakdown is:
- Python tutorial: 4h
- Lectures: 7 sessions @ 2h: 14h
- Exercises:
  - EX1 (data wrangling): 5h
  - EX2 (pandas + sklearn): 10h
  - EX3 (project): 42h [includes review meeting (topic + questions + work plan)]
- SUM: 75h
There are three practical exercises. The third exercise requires a report, a Jupyter Notebook, and a presentation of the results.