Repobase
I'll go build my own Github. With blackjack, and hookers!
This is a living project but I have closed registration.
I’m going to dogfood it eventually and configure an OCI registry. I’m just frantically trying to re-align Oceania and Farkt while working insane hours and job hunting. It’s nuts.
I had the “O” art project smash 10GB of storage in a night and got cold feet. I might need to look into S3…
A long time ago, I thought Github was the coolest idea since sliced bread. I followed their company’s news, revelled in their engineers overcoming challenges and generally loved the idea.
Before Github we had places like SourceForge. Publishing projects was not really streamlined or simple, it was chaotic, full of advertisements and there was a lot of scamming going on.
Github provided a place of peace and safety. I don’t really feel that way anymore.
This will be another long post because it’s another thing that I can’t talk about with other people. I’m just going to be sounding out my own thoughts here, but publishing this because someone, somewhere might find it interesting.
Acquisition
I don’t hate Microsoft as a company. They’ve been pretty smart about how they monetize such an important part of the world and they’ve been pretty good at balancing negatives and positives.
They do sell a packaged experience, and I think that all of the Mac vs Linux vs Windows hate focuses on the subjective as opposed to the objective. Some arguments are rooted in objective fact, but most people just argue subjective points.
One thing that I didn’t like was how they handled the acquisition of Minecraft. There was a relatively peaceful co-existence between Craftbukkit and Mojang where Craftbukkit cheekily borrowed code from Mojang, and Mojang provided the base product upon which Bukkit built.
While I’m sure there is something to this that I failed to see, I saw Microsoft’s attack on Craftbukkit as anti-competitive. I feel we saw why they tried to take down modded clients/servers when they brought out Bedrock, and I feel we’re constantly seeing Microsoft trying to suffocate Java.
I’m not trying to turn this into a hate piece about how Minecraft has evolved. In a lot of ways, Microsoft have made it the game that it is today and they’ve handled the acquisition with more grace than most other studios. My shining examples of ‘good game dev’ practices are Factorio and Valheim, my example of ‘bad game dev’ practices is Valorant, and I consider Microsoft to sit somewhere in the middle, skewing closer to Factorio while barely being much higher than CS2.
When I heard that Microsoft was purchasing Github, I immediately nuked my accounts and cut off my use of their platform. I knew that Microsoft was going to:
- Provide just enough to keep us hooked
- Slowly make the platform an enterprise platform
- Integrate it into their ‘devops’/’cloud’ SaaS model
- Scrape the ever loving shit out of it
And all of this is fair game, which I guess we’ll discuss later, but it just rubbed me the wrong way because it felt like a safe space was being taken from us.
The analogy that I give is a public space being sold to a private entity. Yes, the private entity allows us to use the space freely, but they slowly make the space worse by trying to profit from - rather than directly fund - its maintenance.
Ever since the acquisition I have waited for passionate developers to create “HitGub”, or some kind of stereoisomer of what Github was originally. Gitlab came close but haven’t strongly focused on the community aspect. CodeBerg is a good attempt but it kind-of feels like a hit-or-miss. Gitea has valid reasons for their decisions, but I somewhat agree with the Forgejo fork.
Branding
We should probably talk about this because two things are simultaneously true:
- The name of a business does not matter
- The name of a business causes it to die or thrive
Sharp, descriptive names for projects are important. “Forgejo” is a mouthful and a pain to type, whereas “Gitea” rolls off the tongue and fingers. “GitLab” is familiar to those porting from GitHub.
“Forgejo” and “Codeberg” don’t really have marketability though. You need to imagine proposing this to an executive board when planning a change. If you say “Forgejo” it sounds like a clunky developer-run system that’s super engineer-focussed, whereas “Codeberg” sounds like they’re guiding the Titanic towards its demise.
Companies were already on “GitHub” and the naming made sense because “Git” is a funny word, but they already moved from “SuBvErSiOn” to “Merc.. mer-coo… however it’s said.. hg”. “Git” is funny, witty, and already familiar. “Hub”; ah so it’s a central location for the git; coolio.
“GitLab” borrows from the familiarity of “GitHub”. We’ve already normalized the use of the word “git” in countries that use it to describe an idiot. That association is now gone. “Lab” is just an alternative to “Hub”. Makes sense (from a “normie” point of view).
I wanted to create a name that was:
- Descriptive, concise
- ‘Punchy’
- SFW
Part of “punchy” is that I’ve noticed a trend of two-syllable names performing well. Git-Hub, Git-Lab, Git-Tea, Git-Go, Az-Ure, Goo-Gle, etc. Codeberg is also a fun, albeit long, name that achieves the two-syllable goal.
Sticking to two syllables did complicate making a name though. I did opt to go to three, which hurts how punchy it is for everyday use, but there can be scope to change later.
I opted for “repobase”:
- Descriptive: a base (central place) for repos (repositories)
- Doesn’t restrict itself to Git; it could provide SVN and Hg should I have the motivation/time
- Suggests centrality (base)
- Neutral in meaning; safe for work but also board rooms
Sadly:
- Not punchy. Ree-Poh-Base. Walk around using it in sentences, you’ll want to say “rebase”, “rbase”, “repo”, etc.
- .com not available. .net is fine though; it is a network of developers after all.
- A little long, but thankfully it rolls off the fingers.
The main thing that I liked about it is that it rolls off the fingers. Typing repobase is relatively quick despite being longer, and doesn’t cause overlaps like codeberg or forgejo can. Because it’s two commonly used words for developers, it leverages existing muscle memory which is why I think it’s a little faster to type than Forgejo.
Gitea is a good example of this. You type “git” out of muscle memory, but the ea is the natural fall of your hand after hitting t.
I like the name. I think it works nicely for what I’m trying to do here. And I think I’m going to fight to keep it for as long as I can.
Mission
The next hardest part was balancing a mission with the monetization.
My personal motivation is to create a safe space that could be marketed towards free and open source projects. You can’t have a connected community when there’s a vested interest.
Some projects can’t or won’t associate themselves with a Microsoft platform. That’s their prerogative and they’re entitled to their preferences. Other projects don’t care, but might want to exist in a central location for discoverability and accessibility for merge requests.
It’s not just Microsoft either; some places won’t want a Google association, or Amazon, or whatever. There needs to be a safe place - a middle ground - where you don’t need to consider your personal ethics when signing up.
So repobase is created for this.
The mission of Repobase is not to create the next git management platform. It’s not to create a business model around collaborative coding. It’s to provide a community around the code, and allow people to contribute to projects without having to involve their personal beliefs just to sign up.
Unfortunately, platforms like this need to eventually be monetized. The platform needs to get scaled globally to be accessible to Europe, China and the US for the most adoption. This costs a lot of money, especially in storage and transit. Sometimes this can run contrary to the mission, which sucks.
If compute and storage were so cheap that I didn’t have to care then maybe I’d just let it run for free with unlimited use. Unfortunately this would just not work and likely make me the Chris Poole of code collaboration.
So the mission needs to be:
- Community first
- But funded by something
- That something needs to be benign
For the mission:
- To provide a neutral space for communities to collaborate on projects
For the monetization:
- Provide an accessible base model with fair limitations
- Provide subscriptions for enhancing an experience or unlocking extra functionality
- Balance with what’s available; don’t overuse microtransactions
- Sponsorships without selling data
Costing and Monetization
Right off the bat I know that the following items will cost me the most money:
- Storage
- Transit
Everything else is relatively low effort. Compute is really cheap now so sticking to bare metal self hosted infrastructure is a cost effective base compared to cloud computing. Scaling shouldn’t be desperately required, and wouldn’t save me money at the start.
Storage is also actually very cheap, but redundancy is not. I’d need at least 2RU of storage per compute node to keep up with storage needs if this grows. I might also need to migrate from local storage to object storage to provide scalability.
Fortunately storage density is good. I can get servers with 3.5” sleds and 12-21TB disks. We can sacrifice some overall performance because people, I’m sure, would be understanding of this so long as we don’t run out of room. We can supplement general load with SSD caching and ZFS read speed boosts. We can also keep the footprint down by leveraging deduplication, which is built into ZFS.
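As a rough back-of-envelope sketch of what redundancy costs in raw disk: the 12-bay chassis and the two disks of parity below are assumptions picked purely for illustration, not a settled design, and the maths ignores ZFS metadata overhead and any savings from deduplication.

```python
def usable_tb(disks: int, disk_tb: float, parity_disks: int) -> float:
    """Approximate usable capacity of one parity-protected pool, in TB.

    Ignores filesystem overhead and deduplication savings.
    """
    if not 0 <= parity_disks < disks:
        raise ValueError("parity_disks must be between 0 and disks - 1")
    return (disks - parity_disks) * disk_tb


if __name__ == "__main__":
    BAYS = 12    # assumption: a 12-bay 3.5" chassis
    PARITY = 2   # assumption: two disks' worth of parity
    for disk_tb in (12, 21):  # the disk sizes mentioned above
        raw = BAYS * disk_tb
        usable = usable_tb(BAYS, disk_tb, PARITY)
        print(f"{disk_tb}TB disks: {raw}TB raw, ~{usable}TB usable")
```

Under those assumptions roughly a sixth of each chassis goes to parity before any off-site copy is counted, which is where the real redundancy bill starts.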
Transit will be hard in the beginning but we can ease this relatively quickly by becoming an ISP. Once we have our own routing we can stop having to pay for transit and just focus on maintaining bandwidth.
These two items do give us some scope for what we’ll need to charge for.
Storage will be a huge thing. We can restrict repo sizes; however, people will get around this with submodules. We can reduce account and organization size limits, but people will get around this the same way.
We can’t paywall things like container registries, packages, etc. because people will heavily leverage this feature.
For monetization I am considering the following:
- Paid Runner Hosting
  - Save your cloud transit and compute costs by running on Repobase’s runners
  - No free inclusions; security, safety, cost, etc.
- Paid Page Hosting
  - It’s such a small thing, but I imagine people could see this as ‘fair’ if we keep it cheap.
  - Say $1-2 USD per calendar month, per project, per 1GiB stored.
  - Low cost to run but also low return. Hoping for scale to generate enough income to cover other aspects.
  - We can make this nicer by not restricting SSG pipelines. Bring your custom plugins; fuck it.
- SSO Tax
  - Pretty standard now.
  - I want to provide a way for personal SSO (like self hosting Authentik) to log you into your account, but I think for business automations this should be kept to paying organizations.
- Enterprise Tier
  - Included runner hosting
  - Included pages hosting
  - SSO unlocked
  - No limits or restrictions on usage
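To make the page hosting number concrete, here is a minimal sketch of how a per-project monthly charge could be worked out. Billing per started GiB with a 1GiB minimum is an assumption for illustration; the function name and the rounding rule are hypothetical and nothing here is decided.

```python
import math


def pages_monthly_cost_usd(stored_gib: float, rate_usd_per_gib: float) -> float:
    """Monthly page-hosting charge for a single project.

    Assumption: usage is rounded up to whole GiB, with a 1GiB minimum.
    """
    if stored_gib < 0 or rate_usd_per_gib < 0:
        raise ValueError("storage and rate must be non-negative")
    billable_gib = max(1, math.ceil(stored_gib))
    return billable_gib * rate_usd_per_gib


if __name__ == "__main__":
    # Compare both ends of the $1-2 USD range for a 3.2GiB site.
    for rate in (1.0, 2.0):
        cost = pages_monthly_cost_usd(3.2, rate)
        print(f"3.2GiB at ${rate:.2f}/GiB: ${cost:.2f} per month")
```

Rounding up keeps the invoice simple, but it does mean a 1.1GiB site pays for 2GiB; prorating would be the friendlier alternative.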
Of course then I would have to neuter personal usage. It would have to be fair if I did, like some insane limit that the regular person or FOSS project would never hit, but something that a business would.
FOSS projects would also need a route to sponsorship or exemption. Some projects are insanely important to the internet and we should have an obligation to keep them around, but some get funding and support from elsewhere, and it becomes murky whether we should just give away limited compute to them.
I know that sounds shitty but consider Apache. ASF has its own funding scheme; we would want them using our platform, but they’d also probably require a scale up just to onboard. That’s a very sizable donation for someone smaller than them.
Also consider Ubuntu. They’re a huge org with an enterprise offering. They could bring a lot of business to Repobase but Repobase would crumble under their strain.
So while I dream of keeping the space open for people to build and grow without hindrance, there has to be a ‘fair’ limit where we all agree that contributing towards the cost of running such a space is fair and equitable.
Features to Add
The fun part that I wish I was doing instead.
- Groups within accounts/organizations
  - I hate not having groups like in Gitlab. I desperately want to add this.
- Personal SSO
  - I’d like to auth into repobase using my home SSO server. I don’t know how I’d handle this but I’d love to figure that out.
- Enterprise SSO
  - Provide Enterprise-configurable SSO into the Repobase platform.
  - Company can opt into allowing developers to link their Repobase account with their company account. Developers can opt into linking if they want to.
  - Allows work done at the company to reflect on the developer’s account, making their Repobase account a portfolio of their work.
- Hg integration
  - I don’t know how challenging this would be but I would like to see if I can offer Hg on top of Git. Most people that I know use Git; however, I do know that some companies - like Facebook - have found preference in Hg.
Forking Forgejo is going to be hard enough. Rewriting parts to introduce account limitations, enterprise SSO and subscription models will be very difficult.
I think my focus for the time being will be to work through the following, in order:
- Get Forgejo fully forked and have all features in a Repobase namespace
- Create ‘group’ functionalities so that people can have a comparable experience to Gitlab
- Add in enterprise SSO and personal SSO, and add the account linking
- Test and refine. Re-address scaling. Look into taking on some people to help grow the idea.
Then we can start looking at deeper stuff.
- High priority on introducing fair restrictions.
- High priority on adding subscriptions managed within the platform.
- High priority on page hosting and runner hosting.
- Medium priority on enterprise offering.
- Hg integration; when I feel bored/motivated.
Essentially I’ll try and get it all working within Repobase. Then start building out from my Forgejo fork. Finally I would refine what I have and try and package two subscription tiers, with additional microtransactions for compute.
I think this is a fair approach. Idk.
Theory Crafting
I’m kind of bored of theory crafting. I might update this later but I just wanted to add a note here explaining what I’ve been doing as a side project and why I’ve been doing it.
I kind of hope to get in touch with someone who feels as passionately about this as I do. But I also know that I’ll need to do a fair bit of this work myself before anyone wants to touch such a project.
I do feel bad that I cast shade on everyone else trying to achieve similar goals. The efforts of the Gitea, Forgejo and Codeberg communities are astonishing. What these developers have created over the years is humbling and motivating. I feel really bad rebasing on Forgejo and not making my own platform, but I also know that I’ve tried and failed to create my own from scratch.
I’m hoping that this project will let me contribute code back to Forgejo, or at least provide some financial support. I have absolutely nothing to do with their space, but if this goes well then maybe I might.
In my head if this becomes big then I would say that Forgejo is the self-hostable Repobase, and donate massively to them or something along those lines. But I guess that’s a crappy promise when they’re trying to do the exact same with Codeberg.
Needless to say that Repobase isn’t my only project running. It’s very much a side project. I need to feed myself before I give much away for free if you know what I mean.