Using Git

Dryad code is managed using Git and GitHub.

New to Git?
Here are some resources designed to explain Git and Github (as it relates to Dryad); these are targeted at people unfamiliar with version control.
 * Version Control (a presentation explaining how version control works)
 * Git setup for Dryad development
 * Git cheatsheet
 * Graphical overview of the typical workflow for Dryad development (aka github-flow)

Development process
Dryad maintains a fork of the DSpace codebase. We do not modify the DSpace classes directly; rather, we will continue to use the DSpace maven overlay structure. All code modifications will still be made in the "modules" directories. In the future, we may start modifying DSpace code "directly" in branches, rather than keeping everything in the modules directory.

The DSpace development workflow is a bit out of sync with the Dryad development workflow. Dryad works on an agile basis, deploying updates at least once a week. DSpace works on a more traditional schedule, with major releases once a year.

Dryad follows the GitHub-flow process. Github-flow has one overriding rule: the primary branch (in our case, "dryad-master") should always contain fully tested code that is ready to deploy on the production server. All modifications to the Dryad code are performed in separate branches. Once they have been reviewed, these branches are merged into dryad-master for deployment.

Branches that you should be aware of:
 * dryad-master: this is the branch used to deploy the production server. It is always well-tested and deployable. The production server is redeployed from this branch at least once a week.
 * dryad-dev: this branch is used to deploy the development server. It contains any code that is ready for testing.
 * dspace-master: this branch is periodically updated to keep up with primary DSpace development. We typically won't use it, but it is available for reference.
 * dspace-(version): these branches are updated to keep up with various DSpace releases. The dryad-master branch typically contains all content from one of these branches. When an update is made to our base DSpace version, we will usually merge the update in to the dryad-master branch.

Features are developed asynchronously. Each new feature is developed on its own branch. For testing purposes, a feature may be merged onto the dryad-dev branch. The updated dryad-dev branch will be automatically deployed to the development server. When a feature is complete and has been tested, the developer issues a pull request. The pull request is reviewed by another developer. After the feature passes review, the original developer merges it into the dryad-master branch.

This process allows us to push out each feature as it becomes stable. Features that are not yet stable can remain in their own branches without delaying the release of the features that are stable. Hopefully, this will also allow us to more easily migrate a single feature/branch to the core DSpace codebase when appropriate.

Typical workflow
1. Check out the dryad-master branch git checkout dryad-master

2. Create a new "feature" branch. The branch name should be based on a ticket number, for easy tracking. A branch based on FogBugz case 1234 should be called fogbugz-1234. A branch based on a Trello card 1234 should be called trello-1234. A branch based on an Atmire ticket 1234 should be called atmire-1234. Occasionally, there will be a branch that is not directly based on a ticket. These branches should have a descriptive human-readable name. git checkout -b fogbugz-1234

3. Implement a solution in the branch. Intermediate work can be committed to your local repository. For work that takes more than an hour, it is a good idea to periodically push the branch to GitHub. git add somefile.txt git commit git push origin fogbugz-1234

4. To test, you may want to merge your branch onto the dryad-dev branch. Jenkins will automatically deploy the updates to the development server. git checkout dryad-dev git merge fogbugz-1234 git push origin dryad-dev

5. Prepare to merge into master. When your feature is ready to be deployed, your topic branch will need to be merged into the master branch. To make sure this goes smoothly, you should first make sure all the latest changes on master have been merged into your topic branch. This allows you to deal with any conflicts that arise before asking someone else to test your code. First make sure your master branch is up-to-date:

git checkout master git pull origin master #if the remote repository is set up as the default you can probably just type 'git pull' git checkout new-feature git merge master You will then need to deal with any conflicts that arise and make commits, on your topic branch, to deal with them.

6. When the work is complete, go to GitHub and issue a pull request for the feature branch. CAUTION! By default, GitHub will set the pull request to go to the main DSpace codebase. Normally, you should change this to use the dryad-repo codebase.

7. Select another developer to review the pull request. Unless you have been instructed otherwise, assign the request to Ryan. The other developer may comment and/or request changes. Once the content of the branch is satisfactory, the other developer will post a comment approving the request.

8. Merge the pull request. We want the original developer to take responsibility for the merge process, because they are in the best position to make corrections if any issue comes up during the merge. In most cases, merging can be done from GitHub. In some cases you may need to merge from the command line. git checkout dryad-master git merge newfeature git push origin dryad-master

9. Add an entry to the Release Notes page that briefly describes the item. The note should link to appropriate documentation pages on the wiki.

10. Update your local copy of the code git checkout dryad-master git pull

Periodically, old branches will be removed. However, we do not remove them immediately after they are merged into master. This allows any bugs to be corrected on the branch that originally introduced them, resulting in a more coherent commit history.

Code Review
When you are assigned to review someone else's code, follow these steps:


 * 1) Read through the code to ensure it is:
 * 2) Legible
 * 3) Well-commented
 * 4) Free of any "code smells"
 * 5) Compile it on your local machine and test the new functionality. CAUTION! Even though you may locally merge the new code with dryad-dev to test for any integration problems, do not push the merged branch back to GitHub. Doing so will automatically close the pull request.
 * 6) Ensure the associated documentation is up to date.
 * 7) Verify that the change is accompanied by adequate testing code. Verify that the code passes the tests.
 * 8) Add your comments to the pull request on GitHub.
 * 9) If modifications are required, review those as well, until you and the coder agree that the code is ready to merge.
 * 10) Sign off on the pull request by posting a comment like "Ready to merge."
 * 11) Even if you (the reviewer) are satisfied with the proposed commits, do not merge the pull request—it is the requester's responsibility to review the comments and judge when to actually merge the pull request.

Markdown Checklist
A Github Markdown formatted checklist for the steps above:  - [ ] Read through the code to ensure it is: - [ ] Legible - [ ] Well-commented - [ ] Free of any "code smells" - [ ] Ensure the associated documentation is up to date. - [ ] Compile it on your local machine and test the new functionality. - [ ] Verify that the change is accompanied by adequate testing code. Verify that the code passes the tests.
 * 1) Code Review Checklist

Fetch an upstream pull request
A common code-review step is to pull the branch being submitted in a pull request to the local filesystem.

The commands below assume that the local dryad repository (dryad-repo.git) has the following repository set as its remote upstream branch in the .git/config file:

 [remote "upstream"] url = https://github.com/datadryad/dryad-repo.git fetch = +refs/heads/*:refs/remotes/upstream/*

The commands below use these values, taken from a pull request page:

 ID: pull request id # BRANCH: branch name for branch to merge

where the BRANCH is named as follows:

 {USER} wants to merge {N} commit into datadryad:dryad-master from {USER}:{BRANCH}

These commands will update the local repository with the upstream data about branches, then pull the branch associated with the given pull request, then checkout the branch for testing:

 git fetch upstream git fetch upstream pull/{ID}/head:{BRANCH} git checkout {BRANCH}

GitHub/FogBugz integration
When committing new code to git, you can include a FogBugz case number in the commit message, like "This commit fixes fogbugz 1234". The FogBugz ticket will be updated to include a link to that commit. However, there will not be a direct hyperlink from Github site to the FogBugz case.

Although we usually employ the "fogbugz 1234" syntax, the parser is relatively lenient. It will accept things as divergent as "Case:1234" and "BUGSID 1234". So don't worry about forgetting the correct syntax. Try to inclue the case number even if you forget which syntax to use.

Troubleshooting and advanced usage
See the NESCent Informatics hackday notes on using Git and correcting common Git problems.