A.1 Git

This appendix provides a concise reference to essential Git concepts and commands, tailored for data analysts and researchers who manage code and collaborate with others.


A.1.1 Basic Setup

Configure your Git environment using the git config command:

  • Set your name and email (used in commits):

    git config --global user.name "Your Name"
    git config --global user.email "your.email@example.com"
  • Set your preferred text editor (e.g., for writing commit messages):

    git config --global core.editor "code --wait"  # VS Code
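
To confirm that the settings took effect, the values can be read back with git config --get. The snippet below is a minimal sketch; the variable names and the "(not set)" fallback text are illustrative choices, not part of Git.

```shell
# Read back the global identity settings.
# --get exits with a non-zero status when a key is unset,
# so fall back to a placeholder message in that case.
name="$(git config --global --get user.name || echo '(not set)')"
email="$(git config --global --get user.email || echo '(not set)')"

echo "user.name:  $name"
echo "user.email: $email"
```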

A.1.2 Creating a Repository

To create a new Git repository in your project directory:

git init

This creates a .git directory where Git stores all version control information.
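
As a quick check, the following sketch initializes a repository in a throwaway directory and lists its contents. The scratch directory created with mktemp is an assumption made for illustration; in practice you would run git init inside your own project folder.

```shell
set -eu

repo="$(mktemp -d)"   # hypothetical scratch directory
cd "$repo"

git init -q           # -q suppresses the informational message
ls -a                 # the listing should include the .git directory
```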


A.1.3 Tracking Changes

Git tracks changes through a three-tier structure:

  • Working Directory: your local folder with files.
  • Staging Area: where you prepare changes before committing.
  • Local Repository: stores committed snapshots of your code.

Common commands:

  • Check status:

    git status
  • Add files to the staging area:

    git add filename
    git add .  # Stage all changes in the current directory and below
  • Commit staged changes:

    git commit -m "A brief message describing the change"
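
The three tiers can be observed by following one file through them. This is a sketch in a scratch repository; the file name data.csv, the commit message, and the repository-local identity values are placeholders chosen for the example.

```shell
set -eu

repo="$(mktemp -d)"   # hypothetical scratch directory
cd "$repo"
git init -q

# Repository-local identity (placeholder values) so the commit
# also works on a machine without global settings.
git config user.name "Your Name"
git config user.email "your.email@example.com"

echo "x,y" > data.csv              # working directory: a new, untracked file
git status --short                 # untracked files are marked with ??
git add data.csv                   # move the file to the staging area
git status --short                 # staged additions are marked with A
git commit -q -m "Add data file"   # record the snapshot in the local repository
git status --short                 # prints nothing once the working tree is clean
```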

A.1.4 Viewing History and Changes

  • Show changes not yet staged:

    git diff
  • Show the commit history:

    git log
  • Restore previous versions of files:

    git checkout HEAD -- filename  # Restore last committed version
    git checkout <commit-id> -- filename  # Restore from specific commit
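
The sketch below puts these commands together in a scratch repository: it commits a file, edits it, inspects the change, and then restores the committed version. The file name notes.txt and the identity values are placeholders.

```shell
set -eu

repo="$(mktemp -d)"   # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Your Name"                 # placeholder identity
git config user.email "your.email@example.com"

echo "first" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "second" >> notes.txt
git diff                         # unstaged change: an added "second" line
git add notes.txt
git diff --staged                # the same change, now in the staging area
git log --oneline                # one line per commit

git checkout HEAD -- notes.txt   # discard the edit and restore the committed file
cat notes.txt
```

Note that git checkout HEAD -- notes.txt overwrites both the staged and the working copy, so the uncommitted edit is lost.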

A.1.5 Ignoring Files

To prevent certain files from being tracked by Git, create a .gitignore file. For example:

# .gitignore
*.dat
results/
  • View contents using:

    cat .gitignore
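
To see which rule causes a file to be ignored, git check-ignore can help. The following sketch recreates the .gitignore above in a scratch repository; the file names run1.dat, results/summary.txt, and analysis.py are made up for the example.

```shell
set -eu

repo="$(mktemp -d)"   # hypothetical scratch directory
cd "$repo"
git init -q

printf '%s\n' '*.dat' 'results/' > .gitignore
mkdir results
touch run1.dat results/summary.txt analysis.py

# -v prints the .gitignore line that matches each ignored path.
git check-ignore -v run1.dat results/summary.txt

# Only .gitignore and analysis.py are reported as untracked.
git status --short
```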

A.1.6 Remote Repositories

Git can link a local repository to a remote one hosted on a service such as GitHub:

  • Add a remote:

    git remote add origin https://github.com/yourname/repo.git
  • Push changes to remote:

    git push origin main  # or 'master' depending on default branch
  • Pull changes from remote:

    git pull origin main
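
The same steps can be tried without a hosting account by letting a bare repository on the local disk stand in for the remote. This is a sketch under that assumption; the paths, file name, and identity values are placeholders, and the branch name is read from Git rather than assumed to be main.

```shell
set -eu

work="$(mktemp -d)"                       # hypothetical scratch directory
git init -q --bare "$work/remote.git"     # stand-in for the hosted repository
git init -q "$work/local"
cd "$work/local"
git config user.name "Your Name"          # placeholder identity
git config user.email "your.email@example.com"

echo "hello" > README.txt
git add README.txt
git commit -q -m "Initial commit"

git remote add origin "$work/remote.git"
git remote -v                               # lists the fetch and push URLs

branch="$(git symbolic-ref --short HEAD)"   # main or master, depending on setup
git push -q origin "$branch"
git pull -q origin "$branch"                # nothing new to fetch in this sketch
```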

A.1.7 Collaboration

  • Clone a remote repository:

    git clone https://github.com/username/repository.git

    This creates a local copy and sets up a remote named origin.
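
A short sketch of the origin setup, cloning from a local bare repository instead of GitHub so that it runs offline. The directory names source.git and copy are placeholders.

```shell
set -eu

work="$(mktemp -d)"                     # hypothetical scratch directory
git init -q --bare "$work/source.git"   # stand-in for the repository being cloned

git clone -q "$work/source.git" "$work/copy"
cd "$work/copy"

git remote -v   # origin points back at the repository that was cloned
```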


A.1.8 Branching and Merging

  • Create and switch to a new branch:

    git checkout -b new-branch-name
  • Switch back to main branch:

    git checkout main
  • Merge another branch into the current one:

    git merge feature-branch
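
The following sketch walks through one branch-and-merge cycle in a scratch repository. The file name analysis.txt, the commit messages, and the identity values are placeholders; the name of the starting branch is read from Git because it may be main or master.

```shell
set -eu

repo="$(mktemp -d)"   # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Your Name"                 # placeholder identity
git config user.email "your.email@example.com"

echo "base" > analysis.txt
git add analysis.txt
git commit -q -m "Initial commit"
base="$(git symbolic-ref --short HEAD)"   # main or master, depending on setup

git checkout -q -b feature-branch         # create and switch to the new branch
echo "feature" >> analysis.txt
git commit -q -am "Extend analysis"

git checkout -q "$base"                   # switch back
git merge -q feature-branch               # fast-forward: the base branch had no new commits
git log --oneline
```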

A.1.9 Handling Conflicts

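Merge conflicts occur when two branches change the same lines of a file in different ways. Git will:

  • Mark the conflicting region in the file with <<<<<<<, =======, and >>>>>>> markers.
  • Pause the merge until you resolve the conflict manually, stage the file, and commit.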

Always review and test code after resolving conflicts.
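
To see what a conflict looks like, the sketch below deliberately creates one in a scratch repository and then resolves it. The file params.txt, its contents, and the identity values are invented for the example; in real work the resolution step is done in an editor rather than by overwriting the file.

```shell
set -eu

repo="$(mktemp -d)"   # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Your Name"                 # placeholder identity
git config user.email "your.email@example.com"

echo "alpha = 1" > params.txt
git add params.txt
git commit -q -m "Initial parameters"
base="$(git symbolic-ref --short HEAD)"   # main or master, depending on setup

git checkout -q -b feature-branch
echo "alpha = 2" > params.txt
git commit -q -am "Set alpha to 2"

git checkout -q "$base"
echo "alpha = 3" > params.txt
git commit -q -am "Set alpha to 3"

git merge feature-branch || true   # the merge stops and reports a conflict
cat params.txt                     # both versions appear between conflict markers

# Resolve by writing the intended content, then stage and commit.
echo "alpha = 3" > params.txt
git add params.txt
git commit -q -m "Merge feature-branch, keep alpha = 3"
```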


A.1.10 Licensing

Understanding software licensing is essential in open-source collaboration:

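  • GPL (GNU General Public License): Requires derivative works that you distribute to be released under the GPL as well.
  • Creative Commons: Offers combinations of attribution, share-alike, non-commercial, and no-derivatives conditions, and is intended mainly for documents, data, and media rather than source code.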

Choose licenses aligned with your intended use and contributions.


A.1.11 Citing Repositories

To guide citation practices:

  • Include a CITATION file in your repository.
  • Provide preferred citation formats (e.g., BibTeX, DOI).

This helps others acknowledge your work in academic or professional settings.
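
One possible form for such a file is the Citation File Format (a file named CITATION.cff), which GitHub recognizes. Using that format is an assumption here, since the text above only calls for a CITATION file. The sketch below writes a minimal example in which every value (title, author names, version, date) is a placeholder to be replaced with your own details.

```shell
set -eu

repo="$(mktemp -d)"   # hypothetical scratch directory standing in for your project
cd "$repo"

# All values below are placeholders.
cat > CITATION.cff <<'EOF'
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "Your Project Title"
authors:
  - family-names: "Surname"
    given-names: "Given Name"
version: "0.1.0"
date-released: "2024-01-01"
EOF

cat CITATION.cff
```

If the project has a DOI, it can be recorded under a doi key in the same file.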