SPRING 2021 PHY/MAT 231 How to Think Like a Data Scientist

Page created by Johnny Gill
 
PHY/MAT 231 How to Think Like a Data Scientist
                        SPRING 2021
                         C. De Pree

                   Professor: C. De Pree, cdepree@agnesscott.edu

        Meeting Time: TTh 3:40-4:55 PM (first meeting, Tuesday, Jan. 19, 2021)

                Office Hours: MW 1-2 PM (other times by appointment)

Course Description
This course introduces students to the importance of gathering, cleaning, normalizing,
visualizing and analyzing data to drive informed decision-making, no matter the field of
study. Students will learn to use a combination of tools and techniques, including
spreadsheets, SQL and Python to work on real-world datasets using a combination of
procedural and basic machine learning algorithms. They will also learn to ask good,
exploratory questions and develop metrics to come up with a well-thought-out analysis.
Presenting and discussing an analysis of datasets chosen by the students will be an
important part of the course. Like PHY/MAT 131, this course will be “flipped,” with
content learned outside of class and classroom time focused on hands-on,
collaborative projects.

Course Goals
Students in this course will learn how to:
1. Locate, compare and select online sources of data for analysis
2. Create algorithms and programs that clean, normalize and visualize data
3. Plan basic machine learning algorithms
4. Assess real world datasets using spreadsheets, SQL, and Python

Textbook/Reading
Python: Online Text Only (Runestone: How to Think Like a Data Scientist)
Weapons of Math Destruction, Cathy O’Neil
Students will be provided with a login during the first class. If students took PHY/MAT
131 in Fall 2020, then they should use the same login/password combination from
that class, and add the class ‘asc_phy231s21a’.

Supplementary online reading is noted as appropriate in the text.

Evaluation
Grades will be determined as follows:

Type of Assignment                                   Percentage
Reading/Attendance                                   25%
In-class Work/Participation                          25%
Midterm presentation                                 25%
Final notebook/presentation                          25%
--
Total                                                100%
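
As a rough illustration of how the four equally weighted categories above combine, here is a minimal Python sketch. The weights come from the table; the function name and the example scores are hypothetical, and the sketch assumes each category is scored on a 0-100 scale, which this syllabus does not state.

```python
# Weights taken from the Evaluation table (percent of the course grade).
WEIGHTS = {
    "Reading/Attendance": 25,
    "In-class Work/Participation": 25,
    "Midterm presentation": 25,
    "Final notebook/presentation": 25,
}


def course_grade(scores):
    """Return the weighted course grade from per-category scores (assumed 0-100)."""
    # Sanity check: the weights in the table should add up to 100%.
    assert sum(WEIGHTS.values()) == 100
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example_scores = {
    "Reading/Attendance": 90,
    "In-class Work/Participation": 80,
    "Midterm presentation": 85,
    "Final notebook/presentation": 95,
}

print(course_grade(example_scores))  # 87.5
```

Because every category carries the same weight, a weak score in any one of them costs exactly as much as a weak score in any other.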

Software
The textbook for this course is online, and is housed on the Runestone site. If you took
PHY/MAT 131 last semester, you will simply need to add in this course
(asc_phy231s21a). If you did not, you will be assigned a login that will allow you to take
this course. Early in the semester, we will be using Google Sheets for our data analysis.
Later in the semester, we will use JupyterLab, open-source software that will be
installed in G-15 and can also be installed on your personal computer.

Getting Help
Programming and data science can be fun and rewarding, but can also be tiring and
difficult. Think of this as a conversational class in a foreign language – they are very
similar!

To help alleviate the difficulties that can arise, make sure to utilize the following
resources:
   1. Forums on Canvas will be available for you to post questions regarding your
       code, error messages, and course content. Please do not email me with these
       types of questions as a first recourse; often many students have the same
       question, and it is much easier to answer it once. You are strongly
       encouraged to respond to each other's posts.
   2. Similarly, we will set up a Slack channel early in the semester for
       discussion and questions.
   3. I will be available during office hours (MW 10-11 AM) and at other times by
       appointment.

Course Policies
Classroom courtesy. All students contribute to a classroom environment of mutual
respect. This means arriving before class begins, avoiding disruptive behaviors,
approaching teamwork diligently, listening attentively, and responding actively to one
another’s ideas.

Attendance and engagement. We will have two meetings each week. Tuesday will be
our “synchronous” session, and Thursday our “asynchronous” session. Attendance is
required on Tuesdays, and optional on Thursdays. You can miss 2 Tuesday sessions
without penalty, but after that, each absence will result in a 10% reduction in your
attendance grade.
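
One way to read the attendance rule above is sketched below in Python. This is only an interpretation: it assumes the "10% reduction" means 10 percentage points off a 100-point attendance grade for each absence beyond the two free ones, and that the grade cannot drop below zero. The function and constant names are hypothetical.

```python
FREE_ABSENCES = 2         # Tuesday sessions that can be missed without penalty
PENALTY_PER_ABSENCE = 10  # assumed: percentage points lost per additional absence


def attendance_grade(tuesday_absences):
    """Return the attendance grade (0-100) under the assumptions stated above."""
    penalized = max(0, tuesday_absences - FREE_ABSENCES)
    # Assumed floor: the attendance grade never goes below zero.
    return max(0, 100 - PENALTY_PER_ABSENCE * penalized)


for absences in (0, 2, 3, 5):
    print(absences, attendance_grade(absences))
```

Under this reading, missing a third Tuesday session would bring the attendance grade to 90, and a fifth would bring it to 70.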

Submission of assignments. Assignments are due before or at the beginning of class
on the given deadline. Late assignments will receive a deduction of 1/3 of a letter grade
off the final grade for each day they are late. Days are counted in 24-hour increments
from the beginning of the class at which the assignment is due, and weekend days are
included.
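
The late-penalty count can be sketched in Python as follows. This is a minimal illustration, not an official calculator: it assumes that any partial 24-hour period counts as a full day late, and the function name and example dates are hypothetical.

```python
import math
from datetime import datetime

SECONDS_PER_DAY = 24 * 60 * 60


def late_deductions(class_start, submitted):
    """Return how many 1/3-letter-grade deductions apply to a submission.

    Assumption: each started 24-hour period after the beginning of class
    counts as one full day late, weekends included.
    """
    seconds_late = (submitted - class_start).total_seconds()
    if seconds_late <= 0:
        # Submitted before or at the beginning of class: on time.
        return 0
    return math.ceil(seconds_late / SECONDS_PER_DAY)


# Hypothetical example: due at the start of class on Tuesday, Feb. 9, 2021 (3:40 PM),
# submitted Thursday, Feb. 11 at 10:00 AM.
due = datetime(2021, 2, 9, 15, 40)
turned_in = datetime(2021, 2, 11, 10, 0)
print(late_deductions(due, turned_in))  # 2
```

In this example the work is 1 day and 18 hours 20 minutes late, so two deductions of 1/3 of a letter grade would apply under the stated assumption.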

Inclusion. This course adheres to the principles of diversity and inclusion integral to the
Agnes Scott community. We respect people from all backgrounds and recognize the
differences among our students, including racial and ethnic identities, religious
practices, and gender expressions. We strive for our campus to be a safe space in
which all students feel acknowledged and supported. At the same time, we understand
that course content, critical inquiry, and classroom dialogues give us opportunities to
examine topics from a variety of perspectives. Such discourse is a defining feature of a
liberal arts education, and can compel debates that challenge beliefs and positions,
sometimes causing discomfort, especially around issues related to personal identities.
While we uphold and preserve the tenets of academic freedom, we request and invite
your thoughtful and constructive feedback on ways that we can, as a community of
learners, respectfully assist and challenge one another in our individual and collective
academic work.

College Policies

The Honor Code and Academic Honesty. The Agnes Scott College honor code
embodies an ideal of character, conduct, and citizenship, and is an important part of
the College’s mission and core identity. This applies especially to academic honesty
and integrity. Passing off someone else’s work as your own represents intellectual fraud
and theft, and violates the core values of our academic community. To be honorable,
you should understand not only what counts as academic dishonesty, but also how to
avoid engaging in these practices. You should:
       • review each course syllabus for the professor’s expectations regarding course
       work and class attendance.
       • attribute all ideas taken from other sources; this shows respect for other
       scholars. Plagiarism can include portraying another’s work or ideas as your own,
       buying a paper online and turning it in as if it were your own work, or not citing
       or improperly citing references on a reference page or within the text of a paper.
       • not falsify or create data and resources or alter a graded work without the prior
       consent of your professor. This includes making up a reference for a works cited
       page or making up statistics or facts for academic work.
       • not allow another party to do your work/exam, or submit the same or similar
       work in more than one course without permission from the course instructors.
       Cheating also includes taking an exam for another person, looking on another
       person’s exam for answers, using exams from previous classes without
       permission, or bringing and using unauthorized notes or resources (i.e.,
       electronic, written, or otherwise) during an exam.
       • not facilitate cheating, which can happen when you help another student
       complete a take home exam, give answers to an exam, talk about an exam with
       a student who has not taken it, or collaborate with others on work that is
       supposed to be completed independently.
       • be truthful about the submission of work, which includes the time of
       submission and the place of submission (e.g., e-mail, online, in a mailbox, to an
       office, etc.).

Because of the centrality of the Honor Code to our campus life, penalties result from
dishonest conduct. In academic courses, these penalties can range from failure of the
assignment to expulsion from the college. You should speak with your professors if you
need clarification about any of these policies.

Modified Pledge
Students pledge that they have completed assignments honestly by attaching the
following statement to each test, quiz, paper, overnight assignment, in-class essay, or
other work. Please remember that the Honor Code governs your work at this college.
Please remember to pledge all work in the course.

Here is a statement that you can write/type on your assignments: I pledge that I have
neither given nor received any unauthorized aid on this assignment. (Signed)
_________________________________________

Course statement on the Honor Code and Plagiarism: In this class and this college, we
must adhere to the principles of honesty and fair use. As we seek to engage and
understand the scholarship relevant to our topics and to add our own ideas, it is
imperative that we know how to credit those who have already contributed to the
discussions and to clarify when we are inserting our own thoughtful responses to and
extensions of this knowledge.

Plagiarism has become especially tricky as we are all accustomed, especially when
using the web, to getting information quickly and, in some cases, without considering
its author. In all cases, if you have not generated the content, you must give full credit
to your source in written assignments and oral presentations. Keep in mind that citing
sources makes your work more honest and more credible by showing that you
understand the intellectual conversation and acknowledge its participants.

Title IX
Title IX provision: “No person in the United States shall, on the basis of sex, be
excluded from participation in, be denied the benefits of, or be subjected to
discrimination under any education program or activity receiving Federal financial
assistance.”

For the safety of the entire community, if you have experienced or have any information
about sexual misconduct, the college strongly urges you to make a report immediately
to Deputy Title IX Coordinator Karen Gilbert (kgilbert@agnesscott.edu, 404-471-6435),
or Vice President for Student Life and Dean of Students, Karen Goff
(kgoff@agnesscott.edu, 404-471-6499).

ADA: Agnes Scott College seeks to provide equal access to its programs, services and
activities for people with various abilities. If you will need accommodations in this
class, please contact Rashad Morgan (rmorgan@agnesscott.edu) in the Office of
Academic Advising and Accessible Education (404-471-6150) to complete the
registration process. Once registered, please contact me so we can discuss the specific
accommodations needed for this course.

Course evaluations.
Near the end of the semester, you will be asked via e-mail to complete course
evaluations online. These are reported anonymously and remain unavailable to faculty
until after the semester has ended and grades have been submitted. Your feedback
plays a significant role in shaping our teaching and the LDR courses, generally. Please
help us by completing your evaluation thoughtfully and constructively.

The course schedule is sketched below, with important dates for presentations listed.
The Canvas version of the schedule takes precedence and will be kept up to date. The
schedule is set only through Peak Week so that we can assess our progress at that
point.

WEEK  DATE       In Class                              Reading/Assignments (finish before class)
1     T-Jan 19   Introduction to the course/Overview
                 Collecting Class Data-1
      Th-Jan 21  Collecting Class Data-2               Read Chapter 1
2     T-Jan 26   Happiness Data - 1                    Read Chapter 2 (2.1-2.2)
      Th-Jan 28  Happiness Data - 2                    Read Chapter 2 (2.3-2.4)
3     T-Feb 2    Happiness Data - 3                    Read Chapter 2 (2.5)
      Th-Feb 4   WMD Discussion                        Read WMD (1-3)
                                                       Finalize choice of midterm dataset
4     T-Feb 9    Python Review                         Read 4.1-4.2
      Th-Feb 11  Installing Jupyter Notebooks          Read 4.3-4.4
5     T-Feb 16   Pandas - 1                            Read Chapter 5
      Th-Feb 18  Pandas - 2                            Read WMD (4-6)
6     T-Feb 23   Data Ethics                           Read Chapter 6
      Th-Feb 25  Working with Midterm Dataset
7     T-Mar 2    Midterm Presentations - 1             N/A Presentations
      Th-Mar 4   Midterm Presentations - 2             N/A Presentations
      T-Mar 9    No Class – Peak Week
      Th-Mar 11  No Class – Peak Week
8     T-Mar 16   No Class – Spring Break
      Th-Mar 18
9     T-Mar 23
      Th-Mar 25
10    T-Mar 30
      Th-Apr 1
11    T-Apr 6
      Th-Apr 8
12    T-Apr 13
      Th-Apr 15
13    T-Apr 20
      Th-Apr 22
14    T-Apr 27   No Class - SpARC
      Th-Apr 29                                        Final Project Presentations
15    T-May 4    Last Day of Class                     Final Project Presentations