Apache Airflow Leak Exposes 'Thousands' of Credentials

Intezer: Slack, PayPal and AWS Among Firms Affected
Platforms exposed in Apache Airflow credential leak (Source: Intezer)

Researchers at cybersecurity firm Intezer have uncovered a number of unprotected instances of the workflow platform Apache Airflow that they say have exposed sensitive information belonging to several companies.

The "thousands" of exposed credentials belong to companies in the cybersecurity, media, finance, manufacturing, information technology, biotech, e-commerce, health, energy and transportation industries, according to researchers Nicole Fishbein and Ryan Robinson, who found the vulnerabilities.

Affected companies also include communication platform Slack, fintech firm PayPal and cloud services provider AWS, they say.

These companies - and others - use Apache Airflow, an open-source tool, to programmatically author, schedule and monitor workflows. It is used by data engineers to orchestrate workflows or pipelines.

Impact of the Vulnerabilities

Most of the exposed Airflow instances, according to the researchers, revealed information about the services and platforms that companies use in their software development environments. Some of the instances, they add, included private names of Docker images or internal dependencies used in the workflow.

"Exposing information about tools and packages used in the organization’s infrastructure can jeopardize the organization and also be leveraged by threat actors in supply chain attacks," the researchers note. "This can lead to attacks that leverage dependency or image short names to deliver malicious code instead of the intended code."

Jake Williams, CTO of cybersecurity firm BreachQuest, says the leak is "extremely significant." Unlike more traditional credential leaks that affect individual user accounts, the Apache Airflow leaks affect entire application framework instances, says Williams, who is a former member of the U.S. National Security Agency's elite hacking team.

"Threat actors might use leaked credentials to compromise entire databases containing sensitive user content. In some cases, they may be able to use the credentials to compromise entire application containers and/or run their own containers using a victim’s billing information," he tells Information Security Media Group.

In short, while user information wasn’t directly compromised through these leaks, they open the door to compromises of user data in massive quantities, Williams adds.

Possible Legal Action

Exposing customer information violates data protection laws and may lead to legal action, the researchers say.

The General Data Protection Regulation, for example, applies to organizations handling data of European citizens or entities in the European Union. Information leaks can violate this law and result in administrative fines, the researchers add.

"Disruption of clients’ operations through poor cybersecurity practices can also result in legal action such as class action lawsuits. In the May Colonial Pipeline hack, consumers that had their business disrupted filed class action lawsuits against it," the researchers note.

Insecure Coding Practices

The Intezer researchers say insecure coding practices likely led to the vulnerabilities. They discovered several instances with hard-coded passwords inside the Python Directed Acyclic Graph code, the report says.
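As a rough illustration of the kind of hard-coded secret the researchers describe, the following sketch scans DAG source code for string literals bound to suspicious names. It is not Intezer's tooling; the function names and the `SUSPECT_FRAGMENTS` list are hypothetical, and a name-based heuristic like this will miss some secrets and flag some harmless strings.

```python
import ast
from typing import List, Tuple

# Hypothetical name fragments that suggest a secret; adjust for your codebase.
SUSPECT_FRAGMENTS = ("password", "passwd", "secret", "token", "api_key")


def _looks_sensitive(name: str) -> bool:
    lowered = name.lower()
    return any(fragment in lowered for fragment in SUSPECT_FRAGMENTS)


def _is_nonempty_string(node: ast.AST) -> bool:
    return (
        isinstance(node, ast.Constant)
        and isinstance(node.value, str)
        and node.value != ""
    )


def find_hardcoded_secrets(source: str) -> List[Tuple[int, str]]:
    """Return (line, name) pairs where a string literal is bound to a suspicious name."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # Case 1: plain assignment, e.g. db_password = "..."
        if isinstance(node, ast.Assign) and _is_nonempty_string(node.value):
            for target in node.targets:
                if isinstance(target, ast.Name) and _looks_sensitive(target.id):
                    findings.append((node.lineno, target.id))
        # Case 2: keyword argument in a call, e.g. connect(password="...")
        elif isinstance(node, ast.Call):
            for kw in node.keywords:
                if kw.arg and _looks_sensitive(kw.arg) and _is_nonempty_string(kw.value):
                    findings.append((kw.value.lineno, kw.arg))
    return sorted(findings)


# Made-up DAG fragment: two hard-coded secrets and one read from the environment.
SAMPLE_DAG = '''
import os
db_password = "hunter2"
safe_password = os.environ.get("DB_PASSWORD")
connect(host="db.internal", password="s3cret")
'''

if __name__ == "__main__":
    for line, name in find_hardcoded_secrets(SAMPLE_DAG):
        print(f"line {line}: string literal assigned to '{name}'")
```

The `safe_password` line is not flagged because its value is read from the environment at run time rather than written into the source.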

"Passwords should not be hard-coded and the long names of images and dependencies should be utilized. You will not be protected when using poor coding practices even if you believe the application is firewalled off to the internet," the researchers note. "The configuration file (airflow.cfg) is created when Airflow is first started. It contains Airflow’s configuration and it is able to be changed."

The configuration file, they say, may contain passwords and keys, and if the "expose_config" setting in the file is set to "true," "anyone can access the configuration from the web server UI, and accessing from the UI can expose credentials."
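The setting described above can be checked with a few lines of standard-library Python. This is a minimal sketch that assumes the option sits in the `[webserver]` section of airflow.cfg, as it does in Airflow 1.10 and 2 configuration files; the function name, the sample file contents and the set of values treated as "enabled" are illustrative assumptions. It also ignores any environment-variable override of the setting.

```python
import configparser


def config_is_exposed(cfg_text: str) -> bool:
    """Return True if expose_config looks enabled in the given airflow.cfg text."""
    # Interpolation is disabled so '%' characters in values are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    # A missing section or option is treated as "not exposed".
    value = parser.get("webserver", "expose_config", fallback="false")
    # Assumption: these spellings count as enabled; anything else does not.
    return value.strip().lower() in {"true", "t", "1", "yes", "on"}


# Made-up configuration fragment for illustration only.
SAMPLE_CFG = """
[core]
dags_folder = /opt/airflow/dags

[webserver]
expose_config = True
"""

if __name__ == "__main__":
    if config_is_exposed(SAMPLE_CFG):
        print("expose_config is enabled: the web UI can display airflow.cfg")
```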

Hank Schless, senior manager of security solutions at mobile security provider Lookout, says that even a simple misconfiguration in cloud services and apps could be the "backstage pass that an attacker needs to access the entire infrastructure."

"Attackers are constantly crawling the internet to find misconfigured or unsecured services that they can easily access. One misconfigured service could give an attacker all they need to move laterally throughout the entire infrastructure - especially in large complex infrastructures where the attacker can move quietly without setting off any alarm bells," Schless notes.

Credentials in Airflow can also be leaked through a feature called Variables, in which it is common to see hard-coded passwords, the researchers say. The Apache Airflow documentation defines Variables as a "generic way to store and retrieve arbitrary content or settings as a simple key value store within Airflow."

Other Malicious Possibilities

According to the researchers, Airflow plug-ins or features may also be abused to run malicious code. Any visiting user could edit Variables and thus inject malicious code, they add.

"One entity we observed was using Variables to store internal container image names to execute. These container image variables could be edited and swapped out with an image containing and running unauthorized or malicious code," the researchers say.
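One way to reduce the risk of a swapped image name is to validate the value before running it. The sketch below is an illustration, not something the researchers published: it accepts an image reference only if it names an approved registry and is pinned to a SHA-256 digest. The `ALLOWED_REGISTRIES` set, the registry host names and the function name are hypothetical.

```python
import re

# Hypothetical allow-list of registries the organization controls.
ALLOWED_REGISTRIES = {"registry.example.com"}


def is_trusted_image(reference: str) -> bool:
    """Accept only fully qualified, digest-pinned images from an approved registry."""
    first, sep, rest = reference.partition("/")
    # Docker convention: the first path component is a registry host only if it
    # contains a dot or a colon, or is exactly "localhost".
    has_registry = bool(sep) and ("." in first or ":" in first or first == "localhost")
    if not has_registry or first not in ALLOWED_REGISTRIES:
        return False
    # Require a content digest so the name cannot silently point to new content.
    _, _, digest = rest.partition("@sha256:")
    return re.fullmatch(r"[0-9a-f]{64}", digest) is not None


if __name__ == "__main__":
    pinned = "registry.example.com/data/etl-job@sha256:" + "a" * 64
    for candidate in ("etl-job:latest", "evil.example.net/data/etl-job:1.0", pinned):
        verdict = "accept" if is_trusted_image(candidate) else "reject"
        print(f"{verdict}: {candidate}")
```

A short name such as `etl-job:latest` is rejected because it does not state which registry it comes from, which is the ambiguity the researchers warn about when they recommend long names for images and dependencies.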

Unofficial third-party plug-ins may also enable threat actors to execute malicious code, they add.

A plug-in called airflow-code-editor was also used - although sparingly - to "edit DAGs to include malicious code that can then be triggered from the UI," the researchers say.


The researchers advise users to update their Airflow software to version 2, which includes changes such as the removal of a "dangerous" ad hoc query from the GUI. The new version also enforces login and authentication for all operations in the REST API, while the configuration file is stricter and "requires explicit specifications of configuration values rather than using default values," the researchers say.

The logs in version 2, they add, do not leak information, and the dashboard includes a security tab providing information about users and the permissions they have.

The researchers also advise following secure coding practices, adding that "passwords should not be hard-coded and the long names of images and dependencies should be utilized."

"You will not be protected when using poor coding practices even if you believe the application is firewalled off to the internet," they say.

About the Author

Prajeet Nair

Assistant Editor, Global News Desk, ISMG

Nair previously worked at TechCircle, IDG, Times Group and other publications, where he reported on developments in enterprise technology, digital transformation and other issues.
