Syllabus
- CRN: 7715
- Course: INSY336
- Title: Data Handling and Coding for Analytics
- Section: 002
- Term: Fall 2024
- Instructor: Kyunghee Lee (kyunghee.lee@mcgill.ca)
- TA: Dugmee Hwang (dugmee.hwang@mail.mcgill.ca)
- Class meetings: Tuesday 2:35 ~ 5:25 (BRONF 205)
- Office hour: TBA
This course introduces students to the foundations of data handling and coding using Python, with an emphasis on applications in analytics. Students will learn how to manage, analyze, and visualize data using Python libraries and SQL, integrating these skills into a complete data workflow.
Learning goals
By the end of this course, students will be able to:
- Understand and apply fundamental programming concepts in Python.
- Utilize Python libraries to read, manipulate, and visualize data.
- Perform basic data cleaning, wrangling, and analysis tasks.
- Use SQL for managing data in relational databases.
- Implement an end-to-end Extract-Transform-Load (ETL) workflow.
Textbooks
These textbooks are freely available online. You can purchase a hardcopy if you prefer.
Tools
In this course, we will leverage a diverse set of tools to ensure an engaging and effective learning experience. We will dive into Python programming through EdStem, a cloud-based platform that offers a seamless environment for writing and executing Python code. Furthermore, we will explore data manipulation and persistence using SQLite, a lightweight database engine that integrates smoothly with Python. Additionally, we will utilize DataCamp, an interactive learning platform that offers tailored exercises and challenges to reinforce the course’s theoretical content. Through the integration of these varied tools, students will gain a comprehensive understanding of the subject matter, enabling them to apply the skills learned in real-world scenarios, all within an accessible and supportive learning environment.
Evaluation
Participation
Your participation grade is based on attendance, active engagement in class discussions, and professionalism. Consistent attendance and active involvement in class activities, including asking and answering questions, are key factors in earning participation points. Note that disruptive behavior or lack of engagement may negatively impact this portion of your grade.
In-Class exercise
Each week, lab materials will be made available to complement the lecture topics, serving as a practical platform for hands-on coding practice. These labs feature guided tutorials that walk students through coding tasks within an interactive environment. In addition to the lab tutorials, each session will incorporate in-class exercises. These exercises consist of straightforward coding challenges intended for completion during class time. Students are expected to actively engage with these in-class exercises and complete them alongside the lab materials. These exercises won’t be graded.
Take-Home exercise (EX)
To reinforce the skills and concepts taught in class, students will be given access to specially designed exercises hosted on DataCamp, a leading online platform for learning coding. These take-home exercises include both tutorials and interactive coding challenges, allowing for thorough practice in a self-paced environment. Students may attempt these exercises multiple times up to the specified due date, and will receive a pass/fail grade based on their completion status.
Take-Home exercise | Topic |
---|---|
EX1 (2.5) | Functions, variables |
EX2 (2.5) | Control |
EX3 (2.5) | SQL |
EX4 (2.5) | Data manipulation |
Assignment (ASMT)
Assignments serve as a cornerstone of the course, offering students an opportunity to apply and integrate their newly acquired skills in a real-world context. Each assignment will focus on distinct programming tasks that span the topics covered in class, progressing from individual techniques to comprehensive, end-to-end Extract-Transform-Load (ETL) processes. Students will be tasked with retrieving data from APIs, cleaning and transforming this data, and subsequently loading it into a SQL database. The assignments are designed to be sequential, each building upon the outputs of prior assignments.
Assignment | Topic |
---|---|
ASMT1 (5) | Create a Python program to fetch and process data from a web API. |
ASMT2 (10) | Execute SQL queries to design, populate, and interact with a database. |
ASMT3 (10) | Construct a Python program to complete an ETL process: extract data from a web API, transform it, and load it into a SQL database. |
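To give a sense of the extract-transform-load pattern the assignments build toward, here is a minimal sketch in Python. It is an illustration only, not the assignment specification: the sample payload, the `scores` table, and the function names are all hypothetical, and a hard-coded JSON string stands in for the web API response so the example runs without network access.

```python
import json
import sqlite3

# Hypothetical stand-in for the body of a web API response.
# A real program would fetch this text over HTTP instead.
SAMPLE_RESPONSE = """
[
  {"id": 1, "name": " Alice ", "score": "82.5"},
  {"id": 2, "name": "Bob", "score": null},
  {"id": 3, "name": "Chloe", "score": "91"}
]
"""


def extract(raw_text):
    """Extract: parse the JSON text into a list of dictionaries."""
    return json.loads(raw_text)


def transform(records):
    """Transform: drop records without a score, trim names, cast scores to float."""
    rows = []
    for record in records:
        if record["score"] is None:
            continue
        rows.append((record["id"], record["name"].strip(), float(record["score"])))
    return rows


def load(rows, connection):
    """Load: create the (hypothetical) scores table and insert the cleaned rows."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS scores ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, score REAL NOT NULL)"
    )
    connection.executemany(
        "INSERT OR REPLACE INTO scores (id, name, score) VALUES (?, ?, ?)", rows
    )
    connection.commit()


def run_etl(raw_text, connection):
    """Run the three steps in order."""
    load(transform(extract(raw_text)), connection)


# An in-memory SQLite database keeps the example self-contained.
connection = sqlite3.connect(":memory:")
run_etl(SAMPLE_RESPONSE, connection)
result = connection.execute("SELECT name, score FROM scores ORDER BY id").fetchall()
print(result)
```

Keeping each step in its own function mirrors the sequential design of the assignments: the output of one stage becomes the input of the next.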
Quiz
The quizzes aim to assess students’ understanding of the course material and their ability to apply learned concepts within a timed environment. These assessments will cover crucial aspects of programming, from foundational skills to more specialized topics.
Quiz | Topic |
---|---|
Quiz 1 (15) | Fundamentals of Programming in Python |
Quiz 2 (15) | Data Manipulation and Loading in SQL |
Each quiz will feature a mix of multiple-choice questions, short-answer queries, and hands-on coding challenges, all administered in class within a fixed timeframe. Students are encouraged to actively engage with course materials and in-class exercises as part of their preparation for these assessments.
Final exam
The final exam will cover all the material discussed throughout the course. It will include multiple-choice questions, short-answer questions, and coding challenges. The exam aims to assess your overall understanding of ETL processes using Python and SQL and your ability to apply the concepts.
Grading Structure
Your final grade for the course will be calculated based on the cumulative points you’ve earned from different assessments. These points will then be used to determine your final letter grade according to the university’s grading scale. Here’s a breakdown of how each assessment component contributes to your final grade:
Component | Weight (%) |
---|---|
Participation | 5 |
Take-Home Exercise | 10 |
Assignment | 25 |
Quiz | 30 |
Final Exam | 30 |
To translate these percentages into letter grades and grade points, the following grading scale will be applied:
Grades | Grade Points | Numerical Scale of Grades |
---|---|---|
A | 4.0 | 85 – 100% |
A- | 3.7 | 80 – 84% |
B+ | 3.3 | 75 – 79% |
B | 3.0 | 70 – 74% |
B- | 2.7 | 65 – 69% |
C+ | 2.3 | 60 – 64% |
C | 2.0 | 55 – 59% |
D | 1.0 | 50 – 54% |
F (Fail) | 0 | 0 – 49% |
Refer to McGill Grading and Credit Policy for details.
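The calculation described above can be sketched in Python as follows. The weights and grade boundaries are copied from the two tables. The function names, the example scores, and the choice to compare the unrounded percentage against each lower bound are assumptions made for illustration, since the syllabus does not state how fractional percentages are rounded.

```python
# Weights (in % of the final grade), copied from the component table.
WEIGHTS = {
    "participation": 5,
    "take_home": 10,
    "assignment": 25,
    "quiz": 30,
    "final_exam": 30,
}

# (lower bound in %, letter grade, grade points), copied from the grading scale.
SCALE = [
    (85, "A", 4.0),
    (80, "A-", 3.7),
    (75, "B+", 3.3),
    (70, "B", 3.0),
    (65, "B-", 2.7),
    (60, "C+", 2.3),
    (55, "C", 2.0),
    (50, "D", 1.0),
    (0, "F", 0.0),
]


def final_percentage(scores):
    """scores maps each component to the fraction of its marks earned (0.0 to 1.0)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(percentage):
    """Return (letter, grade points) for a percentage.

    Assumption: no rounding is applied before the comparison.
    """
    for lower_bound, letter, points in SCALE:
        if percentage >= lower_bound:
            return letter, points
    raise ValueError("percentage must be non-negative")


# Hypothetical student: full marks on participation and take-home exercises,
# 80% on assignments, 70% on quizzes, 75% on the final exam.
example_scores = {
    "participation": 1.0,
    "take_home": 1.0,
    "assignment": 0.8,
    "quiz": 0.7,
    "final_exam": 0.75,
}
percentage = final_percentage(example_scores)
print(percentage, letter_grade(percentage))
```

For this hypothetical student the weighted total is 5 + 10 + 20 + 21 + 22.5 = 78.5, which falls in the 75 – 79% band (B+).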
Late Submission
All assignments and exercises are to be submitted by the specified due date and time. **A late submission will incur a penalty of 10% of the total possible marks for each day (or part thereof) past the due date.** The late penalty will be strictly enforced to ensure fairness to all students. Exceptions to this policy may be considered if the nature of the cause for late submission is highly unusual or beyond the student’s control. In such cases, students must contact the instructor as soon as possible to discuss the situation. Supporting documentation may be required.
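As a rough illustration of the late penalty arithmetic, the sketch below counts each started day as a full day and deducts 10% of the total possible marks per day. The function names and the due date are hypothetical, and capping the penalty so that a mark never drops below zero is an assumption; the policy above does not address that case.

```python
import math
from datetime import datetime

SECONDS_PER_DAY = 86400


def days_late(due, submitted):
    """Number of late days, counting any part of a day as a full day."""
    if submitted <= due:
        return 0
    return math.ceil((submitted - due).total_seconds() / SECONDS_PER_DAY)


def mark_after_penalty(raw_mark, total_marks, due, submitted, rate=0.10):
    """Deduct rate * total_marks for each late day.

    Assumption: the result is floored at zero.
    """
    penalty = rate * total_marks * days_late(due, submitted)
    return max(0.0, raw_mark - penalty)


# Hypothetical due date for an assignment worth 10 marks.
due = datetime(2024, 11, 5, 23, 59)
one_hour_late = datetime(2024, 11, 6, 0, 59)
twenty_five_hours_late = datetime(2024, 11, 7, 0, 59)

print(mark_after_penalty(8.0, 10, due, one_hour_late))           # one late day
print(mark_after_penalty(8.0, 10, due, twenty_five_hours_late))  # two late days
```

Under these assumptions, a submission one hour late on a 10-mark assignment loses 1 mark, and a submission 25 hours late loses 2 marks.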
Exam Deferral
Exams must be taken as scheduled. Deferrals will not be granted except in rare and exceptional circumstances that prevent a student from taking the exam at the designated time. Such situations must be discussed with the instructor at the earliest opportunity, and proper documentation may be required to substantiate the request.
Please note that failure to adhere to these policies or to communicate promptly with the instructor may result in a loss of marks or other academic penalties. It is the student’s responsibility to be aware of and understand these policies and to act in accordance with them.
Schedule
Week | Date | Title | Topic | EX* | ASMT** |
---|---|---|---|---|---|
1 | 9/3 | Introduction to Data Analytics | fundamentals of programming | | |
2 | 9/10 | Representation | functions, variables | EX1 | |
3 | 9/17 | Conditionals | operators, boolean, if | EX2 | |
4 | 9/24 | Loops | while, for, list, dict, json | | |
5 | 10/1 | Quiz 1 | | | |
6 | 10/8 | Basic SQL | select, filter, sort | EX3 | |
7 | 10/22 | Advanced SQL | aggregate, join, create, insert | | ASMT1 |
8 | 10/29 | SQL in Python | create, insert; DB-API | | |
9 | 11/5 | Quiz 2 | | | |
10 | 11/12 | ETL | API, JSON, extract, transform, load | | ASMT2 |
11 | 11/19 | Data Manipulation with Pandas | numpy, pandas, query | EX4 | |
12 | 11/26 | ETL and Data Exploration | | | ASMT3 |
13 | 12/3 | Review and Q&A | | | |
14 | TBA | Final exam | | | |
Due dates
- EX*: due in one week
- ASMT**: due in two weeks
Academic integrity
McGill University values academic integrity. Therefore, all students must understand the meaning and consequences of cheating, plagiarism and other academic offences under the Code of Student Conduct and Disciplinary Procedures (Approved by Senate on 29 January 2003) (See McGill’s guide to academic honesty for more information).