Subsubsection 4.6.1.1 Forking Projects
If you want to contribute to an existing project to which you don’t have write access (also known as push access), you can fork the project. When you “fork” a project, GitHub will make a copy of the project that is entirely yours; it lives in your namespace, and you can push to it.
Note: Historically, the term “fork” has sometimes had a negative connotation: it meant that someone took an open source project in a different direction, creating a competing project and splitting the contributors. In GitHub, a “fork” is simply a copy of the project in your own namespace, which lets you make changes publicly as a way to contribute in an open manner.
This way, projects don’t have to worry about adding users as collaborators to give them push access. People can fork a project, push to it, and contribute their changes back to the original repository by creating a merge request, which GitHub calls a pull request. We’ll cover this next. Making a pull request opens up a discussion thread with code review, and the owner and the contributor can then communicate about the change until the owner is happy with it, at which point the owner can merge it in.
To fork a project, visit the project page and click the “Fork” button at the top-right of the page. This gives you a copy of the project in the GitHub cloud.
Once you have your own fork on GitHub, you need to clone a local copy down to a place where you can edit it, most typically your own computer.
Note that while you can make small changes directly on GitHub, doing so is not considered good practice.
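As a rough illustration of cloning a fork and pushing work back to it, here is a minimal sketch. So that it runs without network access, a local bare repository (fork.git) stands in for your fork on GitHub; in real use you would clone a URL of the form https://github.com/<user>/<project>.git instead. The identity, the file name and the branch name fix-readme are all hypothetical.

```shell
set -e
# Throwaway identity and workspace (assumptions for this sketch only).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
work=$(mktemp -d)
cd "$work"

# Build a local stand-in for the fork that GitHub would host for you.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
echo "hello" > seed/README
git -C seed add README
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed fork.git

# Clone the fork to get a local working copy you can edit.
git clone -q fork.git project
cd project

# Do the work on a topic branch (hypothetical name), then push it to the fork.
git checkout -q -b fix-readme
echo "More detail." >> README
git commit -q -am "Expand README"
git push -q -u origin fix-readme
```

The topic branch now exists in the fork, which is the state from which a Pull Request would be opened on GitHub.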
Checkpoint 4.6.2. Exercise – Fork a Repo.
Go to GitHub: Fork a Repo and complete the provided exercise with the octocat/Spoon-Knife repository.
Subsubsection 4.6.1.3 Creating a Pull Request
You can always go to the “Branches” page at https://github.com/<user>/<project>/branches to locate your branch and open a new Pull Request from there.
You can also see a list of the commits in our topic branch that are “ahead” of the main or master branch (in this case, just the one) and a unified diff of all the changes that will be made should this branch get merged by the project owner.
When you hit the “Create pull request” button on this screen, the owner of the project you forked will get a notification that someone is suggesting a change, with a link to a page that has all of this information on it.
Note: Though Pull Requests are commonly used for public projects when the contributor has a complete change ready to be made, they are also often used in internal projects at the beginning of the development cycle. Since you can keep pushing to the topic branch even after the Pull Request is opened, a Pull Request is often opened early and used as a way to iterate on work as a team within a context, rather than opened at the very end of the process.
Checkpoint 4.6.4. Introduction to GitHub.
Introduction to GitHub
Subsubsection 4.6.1.4 Iterating on a Pull Request
At this point, the project owner can look at the suggested change and merge it, reject it or comment on it. Let’s say that he likes the idea, but would prefer a slightly longer time for the light to be off than on.
Where this conversation takes place varies by community; on GitHub, it typically happens online. The project owner can review the unified diff and leave a comment by clicking on any of the lines.
Once the maintainer makes a comment, the person who opened the Pull Request (and indeed, anyone else watching the repository) will get a notification.
Note that anyone can also leave general comments on the Pull Request.
Now the contributor can see what they need to do in order to get their change accepted. Luckily this is very straightforward. With GitHub you simply commit to the same topic branch again and push, which will automatically update the Pull Request.
Adding commits to an existing Pull Request does not trigger a notification, so once you push corrections, you might want to leave a comment to inform the project owner that you made the requested change.
An interesting thing to notice is that if you click on the “Files Changed” tab on any Pull Request, you’ll get a “unified” diff, that is, the total aggregate difference that would be introduced to the main branch if this topic branch were merged in. In git diff terms, it basically automatically shows you git diff main...<branch> for the branch this Pull Request is based on.
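To make the three-dot form concrete, here is a small sketch in a throwaway repository. The branch name topic and the file names are hypothetical. Both branches gain a commit after they diverge, so the two forms of git diff give different answers.

```shell
set -e
# Throwaway identity and workspace (assumptions for this sketch only).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/main

echo "base" > file.txt
git add file.txt
git commit -q -m "Base commit"

# One commit on the topic branch...
git checkout -q -b topic
echo "topic change" > topic.txt
git add topic.txt
git commit -q -m "Add topic.txt"

# ...and one commit on main after the branches diverged.
git checkout -q main
echo "main change" > main.txt
git add main.txt
git commit -q -m "Add main.txt"

# Three dots: what topic introduced since it diverged from main.
# This lists only topic.txt, which is what a Pull Request would show.
git diff --name-only main...topic

# Two dots: a direct comparison of the two branch tips.
# This also lists main.txt, because topic does not contain it.
git diff --name-only main..topic
```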
GitHub always checks to see if the Pull Request merges cleanly and provides a button to do the merge for you on the server. This button only shows up if you have write access to the repository and a trivial merge is possible. If you click it, GitHub will perform a “non-fast-forward” merge, meaning that even if the merge could be a fast-forward, it will still create a merge commit.
If you prefer, you can simply pull the branch down and merge it locally. If you merge this branch into the main branch and push it to GitHub, the Pull Request will automatically be closed.
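A local merge along these lines might look like the sketch below. A local bare repository (origin.git) stands in for the project on GitHub, and the branch name topic is hypothetical. The --no-ff flag is used here to mirror the merge-commit behavior described above; it is a choice made for this sketch, not something the local workflow requires.

```shell
set -e
# Throwaway identity and workspace (assumptions for this sketch only).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
work=$(mktemp -d)
cd "$work"

# Stand-in for the hosted project: main plus a topic branch one commit ahead.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/main
echo "base" > file.txt
git add file.txt
git commit -q -m "Base commit"
git checkout -q -b topic
echo "change" >> file.txt
git commit -q -am "Topic change"
git checkout -q main
cd ..
git clone -q --bare seed origin.git

# The maintainer's working copy.
git clone -q origin.git maintainer
cd maintainer

# Merge the contributed branch locally. It could fast-forward,
# but --no-ff records a merge commit anyway.
git merge -q --no-ff -m "Merge branch 'topic'" origin/topic

# Publish the result.
git push -q origin main
```

After the merge, git log --graph --oneline shows the merge commit with the topic commit as its second parent.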
This is the basic workflow that most GitHub projects use. Topic branches are created, Pull Requests are opened on them, a discussion ensues, possibly more work is done on the branch and eventually the request is either closed or merged.
Note: You can also open a Pull Request between two branches in the same repository. If you’re working on a feature with someone and you both have write access to the project, you can push a topic branch to the repository and open a Pull Request on it to the main branch of that same project to initiate the code review and discussion process. No forking necessary.
Subsubsection 4.6.1.6 Pull Requests as Patches
It’s important to understand that many projects don’t really think of Pull Requests as queues of perfect patches that should apply cleanly in order, as most mailing list-based projects think of patch series contributions. Most GitHub projects think about Pull Request branches as iterative conversations around a proposed change, culminating in a unified diff that is applied by merging.
This is an important distinction, because generally the change is suggested before the code is thought to be perfect. Whether a project works this way depends wholly on the community. In communities that do, it enables an earlier conversation with the maintainers so that arriving at the proper solution is more of a community effort. When code is proposed with a Pull Request and the maintainers or community suggest a change, the patch series is generally not re-rolled, but instead the difference is pushed as a new commit to the branch, moving the conversation forward with the context of the previous work intact.
This way if you go back and look at this Pull Request in the future, you can easily find all of the context of why decisions were made. Pushing the “Merge” button on the site purposefully creates a merge commit that references the Pull Request so that it’s easy to go back and research the original conversation if necessary.
Subsubsection 4.6.1.7 Keeping up with Upstream
If your Pull Request becomes out of date or otherwise doesn’t merge cleanly, you will want to fix it so the maintainer can easily merge it. GitHub will test this for you and let you know at the bottom of every Pull Request if the merge is trivial or not.
If your Pull Request does not merge cleanly, you’ll want to fix your branch so that the merge status turns green.
You have two main options in order to do this. You can either rebase your branch on top of whatever the target branch is (normally the main branch of the repository you forked), or you can merge the target branch into your branch.
Most developers on GitHub will choose to do the latter, for the same reasons we just went over in the previous section. What matters is the history and the final merge, so rebasing gets you little more than a slightly cleaner history, and in return it is far more difficult and error-prone.
If you want to merge in the target branch to make your Pull Request mergeable, you would add the original repository as a new remote, fetch from it, merge the main branch of that repository into your topic branch, fix any issues and finally push it back up to the same branch you opened the Pull Request on.
1. Add the original repository as a remote named upstream.
2. Fetch the newest work from that remote.
3. Merge the main branch of that repository into your topic branch.
4. Fix the conflict that occurred.
5. Push back up to the same topic branch.
Once you do that, the Pull Request will be automatically updated and re-checked to see if it merges cleanly.
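The five steps above can be sketched as follows. So that the sketch runs offline, local bare repositories (upstream.git and fork.git) stand in for the original project and your fork on GitHub; in real use the upstream remote would be the original project's GitHub URL. The branch name slow-blink, the file blink.txt and its contents are hypothetical.

```shell
set -e
# Throwaway identity and workspace (assumptions for this sketch only).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
work=$(mktemp -d)
cd "$work"

# Stand-ins for the original project and your fork of it.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
echo "delay = 1000" > seed/blink.txt
git -C seed add blink.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed upstream.git
git clone -q --bare seed fork.git

# Your topic branch, already pushed to the fork.
git clone -q fork.git project
cd project
git checkout -q -b slow-blink
echo "delay = 3000" > blink.txt
git commit -q -am "Slow the blink down"
git push -q origin slow-blink

# Meanwhile, the original project changes the same line.
echo "delay = 500" > "$work/seed/blink.txt"
git -C "$work/seed" commit -q -am "Speed the blink up"
git -C "$work/seed" push -q "$work/upstream.git" main

# 1. Add the original repository as a remote named upstream.
git remote add upstream "$work/upstream.git"
# 2. Fetch the newest work from that remote.
git fetch -q upstream
# 3. Merge its main branch into the topic branch (a conflict is expected here).
git merge upstream/main || true
# 4. Fix the conflict, here by keeping the topic branch's value.
echo "delay = 3000" > blink.txt
git add blink.txt
git commit -q -m "Merge upstream main into slow-blink"
# 5. Push back up to the same topic branch.
git push -q origin slow-blink
```

How the conflict is resolved in step 4 depends entirely on the change; overwriting the file is only the simplest thing that works in this sketch.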
One of the great things about Git is that you can do that continuously. If you have a very long-running project, you can easily merge from the target branch over and over again and only have to deal with conflicts that have arisen since the last time that you merged, making the process very manageable.
If you absolutely wish to rebase the branch to clean it up, you can certainly do so, but you are strongly encouraged not to force-push over the branch that the Pull Request is already opened on. If other people have pulled it down and done more work on it, you will run into major problems! Instead, push the rebased branch to a new branch on GitHub and open a brand new Pull Request referencing the old one, then close the original.
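One way to follow that advice is to rebase a copy of the topic branch and push the copy under a new name, leaving the published branch untouched. The sketch below assumes the same kind of local stand-ins as a real fork and upstream would provide (upstream.git and fork.git are local bare repositories); the branch names slow-blink and slow-blink-v2 and the file names are hypothetical.

```shell
set -e
# Throwaway identity and workspace (assumptions for this sketch only).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
work=$(mktemp -d)
cd "$work"

# Stand-ins for the original project and your fork of it.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
echo "delay = 1000" > seed/blink.txt
git -C seed add blink.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed upstream.git
git clone -q --bare seed fork.git

# The topic branch the existing Pull Request is based on.
git clone -q fork.git project
cd project
git checkout -q -b slow-blink
echo "delay = 3000" > blink.txt
git commit -q -am "Slow the blink down"
git push -q origin slow-blink
original_tip=$(git rev-parse slow-blink)

# The original project gains an unrelated commit in the meantime.
echo "notes" > "$work/seed/NOTES"
git -C "$work/seed" add NOTES
git -C "$work/seed" commit -q -m "Add notes"
git -C "$work/seed" push -q "$work/upstream.git" main

# Rebase a copy of the branch rather than the published branch itself.
git remote add upstream "$work/upstream.git"
git fetch -q upstream
git checkout -q -b slow-blink-v2 slow-blink
git rebase -q upstream/main

# Push under the new name; no force push is needed.
git push -q origin slow-blink-v2
```

The new Pull Request would then be opened from slow-blink-v2, while the old branch still points at its original commit.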
Subsubsection 4.6.1.8 References
Your next question may be “How do I reference the old Pull Request?” It turns out there are many, many ways to reference other things almost anywhere you can write in GitHub.
Let’s start with how to cross-reference another Pull Request or an Issue. All Pull Requests and Issues are assigned numbers and they are unique within the project. For example, you can’t have Pull Request #3 and Issue #3. If you want to reference any Pull Request or Issue from any other one, you can simply put #<num> in any comment or description. You can also be more specific if the Issue or Pull Request lives somewhere else; write username#<num> if you’re referring to an Issue or Pull Request in a fork of the repository you’re in, or username/repo#<num> to reference something in another repository.
In addition to issue numbers, you can also reference a specific commit by SHA-1. You have to specify a full 40 character SHA-1, but if GitHub sees that in a comment, it will link directly to the commit. Again, you can reference commits in forks or other repositories in the same way you did with issues.
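If you need the full hash to paste into a comment, git rev-parse prints it. The sketch below uses a throwaway repository with a hypothetical commit, and assumes the repository uses Git's default SHA-1 object format, where the full hash is 40 hexadecimal characters.

```shell
set -e
# Throwaway identity and workspace (assumptions for this sketch only).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
echo "hello" > README
git add README
git commit -q -m "Initial commit"

# Print the full hash of the current commit, ready to paste into a comment.
full_sha=$(git rev-parse HEAD)
echo "Fixed in $full_sha"
```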
Subsubsection 4.6.1.10 Task Lists
A useful GitHub-specific Markdown feature, especially for use in Pull Requests, is the Task List. A task list is a list of checkboxes of things you want to get done. Putting them into an Issue or Pull Request normally indicates things that you want to get done before you consider the item complete.
These are often used in Pull Requests to indicate everything you would like to get done on the branch before the Pull Request will be ready to merge. The really cool part is that you can simply click the checkboxes to update the comment; you don’t have to edit the Markdown directly to check tasks off.
What’s more, GitHub will look for task lists in your Issues and Pull Requests and show them as metadata on the pages that list them out. For example, if you have a Pull Request with tasks and you look at the overview page of all Pull Requests, you can see how far along it is. This helps people break down Pull Requests into subtasks and helps other people track the progress of the branch.
These are incredibly useful when you open a Pull Request early and use it to track your progress through the implementation of the feature.
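Task list syntax is Markdown rather than shell, so the sketch below simply writes a sample description to a file whose contents could be pasted into a Pull Request. The file name and the three tasks are hypothetical; the "- [x]" and "- [ ]" markers are the checked and unchecked checkbox forms.

```shell
set -e
work=$(mktemp -d)

# Draft a Pull Request description containing a task list.
cat > "$work/pr-description.md" <<'EOF'
- [x] Write the code
- [ ] Write all the tests
- [ ] Document the code
EOF

# Show the draft so it can be copied into the Pull Request.
cat "$work/pr-description.md"
```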
Subsubsection 4.6.1.12 Quoting
If you’re responding to a small part of a long comment, you can selectively quote out of the other comment by preceding the lines with the > character. In fact, this is so common and so useful that there is a keyboard shortcut for it. If you highlight text in a comment that you want to directly reply to and hit the r key, it will quote that text in the comment box for you.
The quotes look something like this:
> Whether 'tis Nobler in the mind to suffer
> The Slings and Arrows of outrageous Fortune,
How big are these slings and in particular, these arrows?