Open source licensing at GitHub

Ben Balter is the Government Evangelist at GitHub, where he encourages the use of open source philosophies in government entities. Prior to joining GitHub, he was a fellow in the Office of the U.S. Chief Information Officer within the Executive Office of the President, where he was instrumental in drafting the President's Digital Strategy and Open Data Policy; a member of the SoftWare Automation and Technology (SWAT) Team (the White House's first and only agile development team); and a New Media Fellow in the Federal Communications Commission's Office of the Managing Director, where he played a central role in shaping the agency's reimagined web presence.

He's giving a talk at this year's OSCON called "Open source licensing on GitHub by the numbers" with his colleague Tal Niv. We reached out to him to give you a sneak peek of what his talk is about.

How important is open source licensing for GitHub and its users?

Without open source licenses, there'd be no such thing as open source. We'd simply have published code. Open source licenses move code from available to open. It's the freedom that underpins one's ability to use and modify software. It's a feature unique to open source, and it's what separates open source from all other software. While the particulars of which license you choose may not always be paramount, the fact that you license your code is.

Open source licensing is important to GitHub in two ways: First, as the host of the world's largest collection of code, we have a unique opportunity—and arguably an obligation based on that opportunity—to do what we can to support the open source community, and that obviously includes open source licensing. Second, as a company built on open source, it's important that the open source code we depend on and the code we contribute to the open source community are both properly licensed so that others can use it. After all, that's the point of open source.

What's the most common license users are using on GitHub?

The most common license by far is the MIT license, with just shy of half of all licensed repositories opting for the MIT. I think a big reason for that is that developers today are learning to code in a world that open source has already won. There's not the same friction with closed-source software that there once was, and as such developers often opt for practicality over purity. Developers contribute to open source because they want to build cool things, and they believe it's the best (only?) way to build software. The philosophical motivations, although still underpinning their contributions, aren't as in the forefront as they once were.

If you look at the MIT license, it's short and to the point. It tells downstream users what they can and can't do, it includes a copyright (authorship) notice, and it disclaims implied warranties (buyer beware). It's clearly a license optimized for developers. You don't need a law degree to understand it, and implementation is simple. Contrast that with something like the Apache license, which, although nearly identical in terms of what you can and can't do, is much more heavily "lawyered" and is significantly more verbose. Heck, the appendix alone, which explains how to apply the license, is longer than the entire MIT license. It's clearly a license optimized for lawyers.

The other popular license is the GPL family of licenses, which, as a copyleft (viral) license, is categorically different than licenses like MIT and Apache. You can see the relative breakdown of license usage in this blog post from March.

What is GitHub doing to reduce the number of unlicensed repositories?

We could theoretically modify our interface so that open source licensing was opt-out, rather than opt-in as it is today. But the problem with open source licensing isn't that it's too hard to license a project; it's that for many developers the process is extremely intimidating and confusing. That's why we created choosealicense.com to help demystify the process of properly licensing your project. Choosealicense.com itself is open source, and the Choosealicense repo is what powers the license dropdown that you see when you create a new repository. After we launched the website, we saw a significant uptick in the percentage of licensed repositories, both new and historic. It's clear that users want to do the right thing, but often lack the necessary resources.

We also created the License API to make it easier for consumers of open source, both large and small, to know under what license a project is licensed in a machine-readable or automated way. That means organizations concerned with license compliance can script a traditionally manual process, which should encourage unlicensed or improperly licensed projects to take a closer look at under what terms they're distributed. The license API also returns information about each individual license, including the license text and what downstream users can and can't do with the code, again, hoping to make it as easy as possible for users to participate within the open source community.
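To make the "script a traditionally manual process" idea concrete, here is a minimal Python sketch of a license lookup. It is an illustration, not something from the interview: the endpoint path (`/repos/{owner}/{repo}/license`) and the `key`, `name`, and `spdx_id` fields are assumptions based on GitHub's public REST API documentation, and the helper names are hypothetical.

```python
import json
from urllib.request import Request, urlopen

API_ROOT = "https://api.github.com"  # assumed public API root


def license_url(owner, repo):
    """Build the (assumed) repository-license endpoint URL."""
    return f"{API_ROOT}/repos/{owner}/{repo}/license"


def summarize_license(payload):
    """Extract the fields a compliance script would likely care about.

    `payload` is the decoded JSON response; a repository with no
    detected license is treated as having a null `license` object.
    """
    info = payload.get("license") or {}
    return {
        "key": info.get("key"),
        "name": info.get("name"),
        "spdx_id": info.get("spdx_id"),
    }


def fetch_repo_license(owner, repo):
    """Fetch and summarize a repository's license (makes a network call)."""
    request = Request(
        license_url(owner, repo),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urlopen(request) as response:
        return summarize_license(json.load(response))


# Example call (not run here because it needs network access):
# fetch_repo_license("github", "choosealicense.com")
```

Keeping the parsing step (`summarize_license`) separate from the network call means a compliance check can be tested against saved responses before it is pointed at real repositories.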

How does GitHub staff engage with its community?

Unlike many other companies, at GitHub there's often little difference (at least practically) between participating in the open source community in your personal capacity and participating as a GitHubber. GitHub's a company with a strong open source history, and when publishing, consuming, or maintaining an open source project, developers are expected to be full-fledged members of the open source community, just as if they were hacking on something fun on the weekends. It's about participating in a broader conversation, about being human. It's not uncommon for developers at GitHub to bounce back and forth between open source and closed source projects, even multiple times a day. The best way to "engage" a community is to actually be part of it.

One example of this: as part of our open source culture, every other Friday, GitHubbers (from developers to accountants) are encouraged to take a break from their day-to-day jobs to spend the day contributing to open source. It could be learning a new open source project that you haven't used before, triaging issues, contributing to documentation, or of course coding, but the point is, you're contributing to the open source community. We also have an open source team internally, dedicated to helping GitHubbers create, release, and maintain open source software.

Last, I'd be remiss if I didn't mention GitHub's pricing model. Everything you do on GitHub is completely free, as long as it's open source. The only time you pay for project hosting is when it's a closed source project, which you may presumably draw a profit from yourself. In a way, you could say GitHub "taxes" proprietary software in order to support open source. There can be many explanations for this, but the one that resonates with me most as a software developer is the idea that there's tremendous value to all the open source software that's been created in the last decade or so, and if we lost some of that because the maintainers couldn't or weren't interested in paying hosting costs in perpetuity, that'd be a part of the software development canon gone forever.

In this way, you could say GitHub is built on open source, contributes to open source, and supports the open source community.

How often do you (as GitHub employees) contribute to repositories hosted on your service?

I used this quick script to take a look at the open source contributions by GitHub employees during a typical week as well as community contributions from GitHub-maintained projects. In this case, this is a peek at the week starting June 15, 2015. During that particular week, there were:

  • 33 new open source repositories created
  • 3 repositories went from closed source to open source
  • 1,173 issue comments
  • 118 issues opened
  • 316 issues closed
  • 137 pull requests opened
  • 193 pull requests merged
  • 37 releases
  • 970 pushes
  • 4,857 commits
  • For a total of 3,262 open source "events" (issues, comments, pull requests, etc.)
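The script mentioned above is not reproduced here. As a rough idea of how such a tally could work, the following Python sketch counts events by category. It assumes each event is a dictionary with a `type` and a `payload`, using event type names from GitHub's public events documentation (`IssuesEvent`, `PullRequestEvent`, `PushEvent`, and so on). The `classify` and `tally_events` helpers and the sample data are hypothetical and are not the figures or the method behind the numbers listed.

```python
from collections import Counter

# Hypothetical labels for event types that need no further inspection.
SIMPLE_LABELS = {
    "IssueCommentEvent": "issue comments",
    "PushEvent": "pushes",
    "ReleaseEvent": "releases",
    "PublicEvent": "repositories open sourced",
}


def classify(event):
    """Return a human-readable category for one event, or None to skip it."""
    kind = event.get("type")
    payload = event.get("payload", {})
    if kind == "IssuesEvent":
        return f"issues {payload.get('action')}"
    if kind == "PullRequestEvent":
        merged = payload.get("pull_request", {}).get("merged")
        if payload.get("action") == "closed" and merged:
            return "pull requests merged"
        return f"pull requests {payload.get('action')}"
    if kind == "CreateEvent" and payload.get("ref_type") == "repository":
        return "repositories created"
    return SIMPLE_LABELS.get(kind)


def tally_events(events):
    """Count events per category, ignoring types we do not track."""
    counts = Counter()
    for event in events:
        label = classify(event)
        if label is not None:
            counts[label] += 1
    return counts


# Made-up sample data, for illustration only.
sample_events = [
    {"type": "IssuesEvent", "payload": {"action": "opened"}},
    {"type": "IssuesEvent", "payload": {"action": "closed"}},
    {"type": "PullRequestEvent",
     "payload": {"action": "closed", "pull_request": {"merged": True}}},
    {"type": "PushEvent", "payload": {}},
    {"type": "PushEvent", "payload": {}},
    {"type": "WatchEvent", "payload": {}},  # not tracked, so skipped
]

for label, count in sorted(tally_events(sample_events).items()):
    print(f"{label}: {count}")
```

Note that commits would have to be counted separately (for example, from the commits listed inside each push), which this sketch does not attempt.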

There are also many GitHub-maintained projects, and projects maintained by or heavily contributed to by GitHubbers, including Atom, Electron, Git LFS, Homebrew, Bootstrap, Hubot, Octokit, Git, and Linguist, that see hundreds of issues, commits, comments, and pull requests from GitHubbers each week.

This article is part of the Speaker Interview Series for OSCON 2015. OSCON is everything open source: the full stack, with all of the languages, tools, frameworks, and best practices that you use in your work every day. OSCON 2015 will be held July 20-24 in Portland, Oregon.

Aleksandar Todorović
I'm a part of the tech department for an awesome investigative journalism network called OCCRP. I'm really passionate about open source software, artificial intelligence and information security. My open source contributions are now merged with projects like reddit, elementary OS and the Tor Project. I'm running a personal blog where I share my personal stories.

2 Comments

Copyleft licenses are not "viral". This is a false and misleading label. No one who uses GPL'd software does so against their own wishes. It would be nice if the good folks at opensource.com would correct this kind of thing.

GPL is viral. I've released an open source software that I would have liked to use MIT license for but we were forced to license it in GPL because our software linked to a GPL-licensed linear optimization package (GLPK).


This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.