3. Project Management
3.1. Reproducible Data Science Project
3.1.1. Using Virtual Environment for Python
It is good practice to set up and use a virtual environment for each Python (or R) project, so that the project's package versions stay isolated and can be recreated. See the virtual environments tutorial in the Python Docs.
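Most people create an environment from the command line with `python -m venv`. As a minimal sketch, the same step can be done with the standard-library `venv` module that the command wraps. The helper name `create_env` and the `.venv` directory name are illustrative choices, not something these notes prescribe.

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir: Path) -> Path:
    """Create a virtual environment in env_dir and return its config file path."""
    # with_pip=False keeps this sketch fast; real projects usually want with_pip=True
    venv.create(env_dir, with_pip=False)
    return env_dir / "pyvenv.cfg"


# Demo in a throwaway directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    cfg = create_env(Path(tmp) / ".venv")
    env_created = cfg.exists()
    print("environment created:", env_created)
```

In a real project the environment would live next to the code (and be excluded from version control) rather than in a temporary directory.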
3.1.2. Using Jupyter Notebook
File name extension: .ipynb
Separate code chunks and companion texts
Test and edit until the whole notebook runs as expected
Download the notebook as a PDF file (for GitHub project release)
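An .ipynb file is plain JSON in which code chunks and companion text are stored as separate cells. The sketch below builds a tiny notebook in the nbformat 4 layout and counts its cells by type using only the standard library. The notebook content and the helper name `count_cells` are made up for illustration.

```python
import json

# A minimal notebook in the nbformat 4 JSON layout (hypothetical content)
notebook_json = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Analysis\n"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["x = 1 + 1\n"]},
        {"cell_type": "markdown", "metadata": {}, "source": ["Result below.\n"]},
    ],
})


def count_cells(raw: str) -> dict:
    """Count cells of each type in a notebook given as a JSON string."""
    counts = {}
    for cell in json.loads(raw)["cells"]:
        counts[cell["cell_type"]] = counts.get(cell["cell_type"], 0) + 1
    return counts


cell_counts = count_cells(notebook_json)
print(cell_counts)
```

For a notebook on disk, the same function could be applied to the text read from the .ipynb file.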
3.1.3. Using python Engine in RMarkdown
Find a Markdown cheatsheet
Install the R reticulate package
See example in hw-rmkdn.Rmd in the repo
3.2. Setting up a Git Repo
See the documentation at GitHub Docs.
Demonstration with homework template
3.3. Styles
3.3.1. Programming
Naming
file/folder
variables
functions
modules/packages
Spacing
Indentation
Google code recommendations: https://code.google.com/archive/p/soc/wikis/PythonStyleGuide.wiki
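The points above on naming, spacing, and indentation can be illustrated with a short module. This sketch follows common PEP 8 conventions, which is an assumption here; the linked guide may differ in details. All names (`summary_stats.py`, `column_mean`, `DataSummary`, `MAX_ITEMS`) are hypothetical.

```python
"""summary_stats.py: module and file names are short, lowercase, with underscores."""

MAX_ITEMS = 1000  # constants in UPPER_CASE


def column_mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    # Four-space indentation; spaces around operators and after commas
    total = sum(values)
    return total / len(values)


class DataSummary:
    """Class names use CapWords; functions and variables use lower_case."""

    def __init__(self, values):
        self.values = list(values)[:MAX_ITEMS]

    def mean(self):
        return column_mean(self.values)


summary = DataSummary([1, 2, 3, 4])
mean_value = summary.mean()
print(mean_value)
```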
3.3.2. Git Repo
Commit frequently (more snapshots)
Write informative commit messages
Keep it clean (no temporary or generated files)
Make it reproducible (e.g., relative path)
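As a small illustration of the relative-path advice, the sketch below builds a data file location from the project root instead of hard-coding an absolute path that exists on one machine only. The directory layout, file name, and helper `data_path` are hypothetical.

```python
from pathlib import Path


def data_path(project_root: Path, *parts: str) -> Path:
    """Build a path inside the project from its root, not from an absolute location."""
    return project_root.joinpath(*parts)


# Hypothetical layout: <repo>/data/raw/survey.csv
project_root = Path(".")  # in a script, Path(__file__).resolve().parent is a common choice
csv_path = data_path(project_root, "data", "raw", "survey.csv")

print(csv_path.as_posix())
print(csv_path.is_absolute())
```

Because the path is relative to the repo, the same code runs unchanged after someone else clones the project to a different location.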