Airbyte: The Money and Licensing of Software Plumbing

The San Francisco-based startup that’s caught the eyes of VCs, to the tune of $181.2 million invested in less than a year, is also licensing its platform under the fauxpen source Elastic License.

Airbyte co-founders John Lafleur, COO (left) and Michel Tricot, CEO (right) on a rooftop in San Francisco where the company is headquartered. SOURCE: Airbyte

Airbyte, a basically open source startup that specializes in connectors for data channels, caught the attention of venture capitalists in a big way in 2021, with three funding rounds completed during the year that raised a total of $181.2 million, for a valuation of $1.5 billion. Not bad for a startup that won’t enter its second year until early 2022.

It also seems to have avoided what could have been a kerfuffle with open sourcers with a decision to license the software that runs its SaaS platform with the Elastic License v2. This move came with the risk of generating a bit of unwanted controversy, since the license is one of a spate of licenses that have been created during the last five years or so that some are derisively calling “fauxpen source,” because they have some of the trappings of open source but are believed to not qualify for approval by the Open Source Initiative standards organization.

Airbyte is in the unsexy business of building and marketing the connectors necessary for data pipelines that carry massive amounts of data from databases, data warehouses, and data lakes to locations that could include on-premises storage or cloud data warehouses such as Amazon Redshift, BigQuery, and Snowflake. These pipelines are becoming increasingly important, as data is being collected and stored everywhere, and connectors such as the ones Airbyte markets are what make these pipelines possible.

“Airbyte solves two problems,” the company said in a statement back in May. “First, companies always have to build and maintain connectors on their own [and] most ‘long tail’ data connectors are orphans that are not supported by closed-source ELT technologies. Second, data teams often have to do custom work around pre-built connectors to make them work within their data architecture.”

ELT stands for “extract, load, transform,” which describes the steps of the data pipeline process. In the more common variant, ETL, the data is transformed before it is loaded into its destination rather than after.

“Open-source addresses both problems with a growing user community that supports more and more connectors, which can be shared and used widely,” Airbyte added.
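For readers who want to see the ELT and ETL distinction in concrete terms, here is a minimal Python sketch. It is an illustration only, not Airbyte code: every function name, the table names, and the dictionary standing in for a data warehouse are assumptions made up for this example.

```python
# Hypothetical illustration of ETL vs. ELT ordering.
# None of these names come from Airbyte; the "warehouse" is just a dict.

def extract(source):
    """Pull raw records out of a source system."""
    return list(source)


def transform(records):
    """Clean records: tidy up names, lowercase emails, drop rows with no email."""
    return [
        {"name": r["name"].strip().title(), "email": r["email"].lower()}
        for r in records
        if r.get("email")
    ]


def load(records, warehouse, table):
    """Write records into a named table in the destination."""
    warehouse[table] = list(records)


def run_etl(source, warehouse):
    """ETL: transform in flight, so only cleaned data reaches the warehouse."""
    load(transform(extract(source)), warehouse, "customers")


def run_elt(source, warehouse):
    """ELT: land the raw data first, then transform it inside the warehouse."""
    load(extract(source), warehouse, "raw_customers")
    warehouse["customers"] = transform(warehouse["raw_customers"])


source = [
    {"name": " ada lovelace ", "email": "ADA@Example.com"},
    {"name": "no email", "email": ""},
]
etl_warehouse, elt_warehouse = {}, {}
run_etl(source, etl_warehouse)
run_elt(source, elt_warehouse)
```

Both runs end with the same cleaned `customers` table. The difference is that the ELT run also keeps the untouched `raw_customers` table in the destination, which is one reason the approach pairs naturally with cloud data warehouses.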

Airbyte on a Roll

The importance of these pipelines to large enterprises is made evident by the amount of money that Airbyte has raised in 2021.

On Friday, the company announced that it had raised more than $150 million in a Series B funding round led by Altimeter Capital and Coatue Management, with additional investments from Thrive Capital, Salesforce Ventures, Benchmark, Accel, and SV Angel.

“Airbyte has already made a huge impact in a very short period of time and has more than 1,000 companies lined up to take advantage of its Airbyte Cloud data service that is starting to roll out,” Jamin Ball, a partner at Altimeter Capital, said in a statement. “There is tremendous market momentum on top of Airbyte’s disruptive model to involve its users in building the ecosystem around its data integration platform.”

This round followed a $26 million Series A round that closed in May and was led by Benchmark, with participation from 8VC, Accel, SV Angel, and Y Combinator, as well as Shay Banon, co-founder and CEO of Elastic; Auren Hoffman, co-founder of LiveRamp; and Dev Ittycheria, CEO of MongoDB. At that time, Chetan Puttagunta, a Benchmark general partner, took a seat on Airbyte’s board of directors.

The Series A round was preceded by a $5.2 million seed round that closed in February, led by Accel, with participation from 8VC and Y Combinator.

Harnessing Its Open Source Community

In March, shortly after the seed funding round, Airbyte’s chief operating officer and co-founder, John Lafleur, told me that the details of exactly how the startup would monetize the connectors they were giving away for free would have to wait until after the next funding round, which at the time he figured was about a year away.

“To be honest, we’re really focused on the open source part, so that means we won’t be focusing on monetization until the Series A round in 2022,” he said.

At that time, a community of open source-minded developers had already formed around the company’s MIT-licensed connectors: devs who were not only downloading and deploying the connectors, but who were also contributing their own homegrown connectors back upstream to support the community.

The plan was to continue to build on that community and to grow the number of connectors available, both through development done in-house at Airbyte and from software contributed from the community. More importantly, a proprietary platform that could be used to ease the pain for enterprises with a large number of connectors to maintain was already being engineered and tested.

Eventually, the company planned for that platform to be available not only for enterprises to run on their own infrastructure (on-premises or in a hybrid environment) but as a managed SaaS offering as well, something Lafleur said was much needed by enterprises wrangling massive amounts of data moving between diverse endpoints.

“I think the average in terms of number of connectors within companies is about 25, and the bigger you are the more connections you have internally,” he said. “Enterprises have hundreds of connectors, and you start to feel the pain of managing those connectors after seven, eight, or ten. At that point, you already have the trigger to use us for the management of the connectors, so we’re not really worried about monetization. It’s really about commoditizing data integration as our mission and solving this for a company.”

After that conversation, the startup’s fortunes accelerated rapidly on multiple fronts, with the successful negotiation of both A and B funding rounds more than a month earlier than either Lafleur or co-founder and CEO Michel Tricot had anticipated for a single round. The company has also moved forward quickly on the technology front, with its managed SaaS service beginning beta testing with live customers in October.

The company has also defined its business model more clearly, in a way that should meet with approval from the open source community, which is important because these days enterprises tend to take an open-source-first position.

A few months back it announced that it would be operating under “a community-based participative model,” under which it will share revenues with “contributors of high-quality connectors,” and added that it expects 500 high-quality connectors to be available by the end of 2022. That means the third-party developers who have been contributing connectors upstream can now monetize them on Airbyte’s platform.

“With the rise of the modern data warehouses, our mission is to power all the organizations’ data movement and doesn’t end at ELT,” Tricot said in a statement issued last week. “By the end of 2022, we will cover more types of data movement, including reverse-ETL and streaming ingestion.”

Fauxpen Source Explained

In September the company announced that even though the connectors it offers will be open source (mostly, some outside vendors might opt to use a proprietary license), the core of its management platform would be covered under the Elastic License v2, a class of proprietary software license that proponents like to call “source available” but which is more commonly called “fauxpen.”

In almost all cases, these licenses were developed to be applied to previously open source databases or software-as-a-service platforms, as a way to keep cloud operators from using the software to run their own competing services. Generally, these licenses are based on existing OSI-approved licenses, and grant users all of the benefits of the underlying open source license, but with caveats that prevent the software from being offered as SaaS or integrated into proprietary software applications.

Probably the most well-known example is the Server Side Public License, which was created by MongoDB to replace the GNU Affero General Public License, an OSI approved license meant to handle the same SaaS issue, but which is widely considered to be ineffective.

This came about after Mongo became convinced that Amazon Web Services was using its database in a managed service. The license it created is basically the AGPL, but with the caveat that when the software is being made available as part of a service, the operator must release the source code for all applications being used to supply the service “such that a user could run an instance of the service using the Service Source Code you make available.”

OSI, whose seal of approval is generally considered necessary for open source licenses, is unlikely to approve any of the fauxpen licenses as open source, either because of restrictions on how the software can be used or because the license places restrictions on other software, with neither being allowed by the Open Source Definition.

The latter was the case with the SSPL, which Mongo submitted for OSI approval, but withdrew when it became obvious the license was not going to be approved in any way that would meet MongoDB’s needs.

Other examples of fauxpen source include MariaDB’s Business Source License, which it uses for its MaxScale database proxy and its Cluster Management API, and which requires payment when used in a commercial software-as-a-service environment but converts to open source after a specified time; the Cockroach Community License from Cockroach Labs, which much like the BSL requires the purchase of a license when used in a commercial SaaS offering; and the Elastic License, which also prohibits using the software in a commercial SaaS product. The latter license was developed by Elastic, the search company whose CEO is an Airbyte investor, and is the license Airbyte is now using for its platform.

Why Go Fauxpen?

Although early alpha and beta releases of the Airbyte Cloud platform were under the open source MIT license, the company said from the beginning that the software would eventually be released under some kind of proprietary license, due to the same concerns others have about outside organizations using the software to run competing services.

The Elastic License works well for Airbyte’s purpose, since it wants enterprises and others to be able to download and use the software in-house, and even monetize it in ways other than SaaS.

In this way the license might be better considered as a “permissive” proprietary license, to borrow a word used in open source licensing. Companies can download the software, install and run it in their data centers or clouds, modify the code in any way they like, and even distribute it to others. The one thing they can’t do is use it to directly compete with Airbyte.

That seems fair. You’re not supposed to bite the hand that feeds you. It also doesn’t make sense for open source advocates to complain about a proprietary license being too open.

Take our poll:

What do you think about vendors licensing their software under "fauxpen" proprietary licenses?

  • It's fine with me (32%, 14 Votes)
  • Depending on the circumstances, it could be okay (32%, 14 Votes)
  • Never! Fauxpen licenses are too easily confused with open source (30%, 13 Votes)
  • I don't know (7%, 3 Votes)

Total Voters: 44
