Class Description

The workshop will focus on knowledge extraction and discovery from data, using statistical tools and machine learning algorithms. The students will be required to design and implement such systems and present their results in class.

Meeting Schedule

# Date Class Details Lecturer Files
1 27/10/2019 Introduction: Intro to data science, project details, important dates Daniel Deutch Slides
2 03/11/2019 Hands-On Data Science in Python: Jupyter Notebook, NumPy, SciPy, scikit-learn, pandas Amit Somech Slides
Material (Notebook, data files)
3 22/12/2019 Student Presentations #1: 5 minutes, 5 slides - Presentation of initial results
5 26/01/2020 Student Presentations #2: (Almost) Final presentations: Problem, model, techniques, results
6 01/03/2020 Projects Submission Deadline

Notifications

Date Notification
23/10/2019 First task: (1) Email us your team members (names & emails) NO LATER THAN 10/11/2019
(2) After you form a team, choose a dataset from here
31/10/2019 Second task: (1) Submit (by email) a self-contained presentation (no longer than 15 slides) that describes your dataset of choice, some results of initial analysis (+visualizations), and the project problem formulation. If previous work exists (academic papers or data science notebooks), review what has been done and explain how your project differs.
(2) Last submission date is 01/12/2019.
(3) After you submit, we will send back comments and possibly some follow-up questions. Some teams may be encouraged to attend an office hour (although all teams/individuals are welcome to schedule an office hour).
Important: Each team must receive a written confirmation regarding the dataset and problem. Also, any change to an already approved dataset/problem needs to be approved again.
01/11/2019 Final project guidelines: here (PDF)

Course Grading

Resources

Resource URL
A list of tutorials collected by Kaggle https://www.kaggle.com/wiki/Tutorials
The official Kaggle DIY tutorial in Excel, Python and R https://www.kaggle.com/c/titanic
World Data Bank http://datacatalog.worldbank.org
A GitHub repository collecting many resources regarding data science https://github.com/justmarkham/DAT4/blob/master/resources.md