
I found 10k GitHub repositories distributing Trojan malware

orchidfiles.com|901 points|237 comments|by theorchid|Jun 18, 2026

Uncovering 10,000 Trojan-Distributing Repositories on GitHub

Date: 18 June 2026
Author's Note: This is a detailed account of how I identified a massive malware campaign leveraging GitHub to spread Trojans.

I stumbled upon a large-scale operation where approximately 10,000 repositories were used to distribute malicious software. Interestingly, these repositories weren't forks; they were created by distinct contributors with unique names, making them appear unrelated at first glance. However, they all adhered to a specific behavioral signature, which allowed me to automate their discovery.


🔍 The Initial Discovery

The journey began when I was checking the SEO indexing of one of my own GitHub projects.

  1. Google Search: My actual repository appeared as expected.
  2. Bing Search: A different repository appeared. It had the exact same name and description as mine.

Upon investigation, I found it was a mirror of my project. It contained every one of my commits and even listed me as a contributor. However, a new commit had recently been pushed that modified the README.md file.

While browsing tags for another project, I noticed a similar occurrence: another repository mirroring a project's content, but with a recently added link to a .zip archive in the readme.

The Behavioral Loop

By monitoring these accounts, I noticed a strange cycle:

Every few hours, the attackers would delete the most recent commit and push the exact same one again. The only change was the inclusion of a link to a zip archive in the README.md.

The Struggle with Support

I attempted to report this to GitHub, but the experience was frustrating:

  • GitHub Support: No response for two weeks.
  • AI Assistants: Provided no actionable advice.
  • Community Threads: Instead of helpful tips, I received "AI slop" from three different users.

Eventually, after a month, GitHub confirmed via email that the specific repositories I reported had been removed.


📦 Anatomy of the Malware

I noticed several other repositories following this pattern (e.g., lucasheriq4374/welink, lucioloprey/OcyShield-Framework, and luigi1973/AssetRipper-CLI). The linked zip archives consistently contained the following file structure:

File Category      | Possible Filenames
Execution Script   | Application.cmd or Launcher.cmd
Main Payload       | loader.exe, luajit.exe, or other random .exe names
Configuration/Data | random_name.cso or random_name.txt
Dependency         | lua51.dll
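
As a rough illustration of how the file structure above could be checked automatically, here is a minimal Python sketch. It only lists the member names of an archive and never extracts or runs anything. The function names and the rule that all four categories must be present are assumptions made for this example, not part of the original investigation.

```python
import io
import zipfile

# Filenames taken from the table above, lowercased for comparison.
LAUNCHER_NAMES = {"application.cmd", "launcher.cmd"}
DEPENDENCY_NAME = "lua51.dll"


def matches_archive_layout(names):
    """Return True if a list of archive member names fits the observed layout.

    Assumption: all four categories (launcher, .exe payload, .cso/.txt data
    file, lua51.dll) must be present for a match.
    """
    base = [name.rsplit("/", 1)[-1].lower() for name in names]
    has_launcher = any(name in LAUNCHER_NAMES for name in base)
    has_payload = any(name.endswith(".exe") for name in base)
    has_data = any(name.endswith((".cso", ".txt")) for name in base)
    has_dependency = DEPENDENCY_NAME in base
    return has_launcher and has_payload and has_data and has_dependency


def zip_matches_layout(data):
    """Check the member names of a zip archive held in memory as bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return matches_archive_layout(archive.namelist())
```

A match is only a hint that an archive resembles the ones described here; it says nothing about what the files actually do.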

Detection Nuance:

  • If you submit the URL of the archive to VirusTotal → 0 detections.
  • If you upload the actual .zip file → Trojan detected.

🛠️ Engineering the Detection Script

My subconscious kept dwelling on this. I realized I could find every single one of these repositories if I could define a mathematical and behavioral pattern.

The Search Criteria

To identify these repos, the script needed to find targets that met these conditions:

  • Frequent deletion and re-pushing of commits.
  • Only the README.md is modified in the latest commit.
  • The README.md contains a link to a .zip file.
  • Commits are mirrored from another source.
  • The repo is a standalone entity, not a fork.
  • Diverse contributors and repository names.
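
The per-repository conditions above can be sketched as a single predicate. This is a minimal Python illustration, not the actual script: the RepoSnapshot record, its field names, and the min_pushes threshold are all hypothetical, and the data is assumed to have been collected beforehand. The last criterion (diverse contributors and repository names) describes the campaign as a whole, so it is not checked per repository here.

```python
import re
from dataclasses import dataclass, field

# Matches an http(s) link that ends in ".zip" (a simple heuristic).
ZIP_LINK = re.compile(r"https?://[^\s)<>]+\.zip\b", re.IGNORECASE)


@dataclass
class RepoSnapshot:
    """Hypothetical summary of one repository, gathered elsewhere."""
    is_fork: bool
    latest_commit_files: list   # paths changed in the most recent commit
    readme_text: str
    pushes_last_24h: int
    pusher: str = ""            # account that pushes to the repository
    commit_authors: set = field(default_factory=set)


def readme_has_zip_link(text):
    return ZIP_LINK.search(text) is not None


def looks_suspicious(repo, min_pushes=1):
    """Apply the per-repository criteria; min_pushes is an assumed threshold."""
    only_readme = [p.lower() for p in repo.latest_commit_files] == ["readme.md"]
    # "Mirrored" is approximated as: history contains authors other than the pusher.
    mirrored = any(author != repo.pusher for author in repo.commit_authors)
    return (
        not repo.is_fork
        and repo.pushes_last_24h >= min_pushes
        and only_readme
        and readme_has_zip_link(repo.readme_text)
        and mirrored
    )
```

Keeping each condition as a named boolean makes it easy to loosen or tighten one criterion without touching the others.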

The Technical Workflow

Analyzing every single GitHub repo would take forever. Instead, I used gharchive to isolate high-activity repositories.

The Data Scale: Over a 5-day window, there were approximately 1.6 × 10^7 (16 million) commit pushes. Filtering for those updated every few hours narrowed this down to just 3,000 repositories.
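
For readers unfamiliar with GH Archive, the sketch below shows one way to count pushes per repository from its hourly dumps. It assumes the files have already been downloaded and that each one is a gzip-compressed file with one JSON event per line, where push events carry "type": "PushEvent" and the repository name sits under repo.name. The function names are made up for this example, and the original script may have worked differently.

```python
import gzip
import json
from collections import Counter


def count_pushes(lines):
    """Count PushEvent records per repository from JSON-lines text."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("type") == "PushEvent":
            counts[event["repo"]["name"]] += 1
    return counts


def count_pushes_in_file(path):
    """Same as count_pushes, reading from a local hourly .json.gz dump."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return count_pushes(handle)
```

Counters from several hourly files can be added together with `+`, which gives the per-repository totals over a multi-day window.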

Refining the Filter

Initially, I applied strict filters to remove noise:

  1. user != bot
  2. time_since_previous_commit > 1 month
  3. contributor_count > 1

This left me with only 14 repositories. I was disappointed, thinking the campaign was smaller than I imagined.


💡 The "Aha!" Moment

Just as I was about to write the article based on those 14 results, I performed a manual double-check. I noticed something critical: all of them had been updated exactly 20 hours ago.

I realized my "updated every few hours" filter was too aggressive and was discarding repositories that updated less frequently. Furthermore, I found repos where the latest commit showed "zero changes" but still existed, and almost all of them shared the exact commit message: Update README.md.

Final Adjustment

I modified the script to look for repositories updated between 1 and 24 times every 24 hours.

The Result: The count jumped from 14 to 10,000.
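
The adjusted frequency filter can be sketched as a simple rate window. This is a minimal Python illustration under stated assumptions: push totals per repository are already available for a known number of days, the bounds are parameters, and the defaults of 1 and 24 updates per day reflect my reading of the adjustment described above. The function name is hypothetical.

```python
def select_candidates(push_counts, days, low=1, high=24):
    """Return repositories whose average pushes per day fall within [low, high].

    push_counts maps a repository name to its total pushes over `days` days.
    """
    if days <= 0:
        raise ValueError("days must be positive")
    return sorted(
        repo
        for repo, total in push_counts.items()
        if low <= total / days <= high
    )
```

With a 5-day window, a repository pushed 5 times averages one update per day and is kept, while one pushed 500 times averages 100 per day and is dropped as ordinary high-activity noise.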