Class Description

The workshop will focus on knowledge extraction and discovery from data, using statistical tools and machine learning algorithms. The students will be required to design and implement such systems and present their results in class.

Meeting Schedule

# Date Class Details Lecturer Files
1 30/10/2016 Introduction: Intro to data science, project details, important dates Daniel Deutch Slides
2 06/11/2016 Hands-On Data Science in Python: IPython, Jupyter Notebook, NumPy, SciPy, scikit-learn, pandas Amit Somech Slides, Material (Notebook, data files)
3 11/12/2016 Preliminary Demo: Demo presentations by students, outlining the project (problem formulation, tools and techniques, etc.)
4 01/01/2017 Status Meeting: Teams will report the current status of their projects
5 26/01/2017 Final Project Presentations (Thursday, 3-6pm): Problem, model, techniques, current results
6 13/03/2017 Projects Submission Deadline

Notifications

Date Notification
25/10/2016 Final project guidelines: here (PDF)
25/10/2016 Final project grading sheet: here (PDF)
30/10/2016 First task: (1) Send us details of your group members by 06/11/2016
(2) Browse through the data and settle on a preliminary goal, direction, and tools.
(3) Send us 3-5 slides describing (2) by 20/11/2016
(4) Ask to meet us if you're stuck.
30/11/2016 Preliminary demo guidelines:
  • Presentations will be given in front of the class; the schedule will be published on December 8th
  • Each presentation is 10 minutes
  • Should include the following:
    • The goal of your project and why it is interesting or important
    • The dataset segments you chose to work with, and a short explanation of what they contain
    • A formulation of a machine-learning problem (e.g. "predict future GDP of first-world countries")
    • Problems you have encountered so far (e.g. missing data) and how you intend to deal with them
    • An outline of your implementation plan

Resources

Resource URL
A list of tutorials collected by Kaggle https://www.kaggle.com/wiki/Tutorials
The official Kaggle DIY tutorial in Excel, Python and R https://www.kaggle.com/c/titanic
World Data Bank http://datacatalog.worldbank.org
A GitHub repository collecting many data science resources https://github.com/justmarkham/DAT4/blob/master/resources.md