Hacker News

Using SQLite instead of JSON could have worked too and would likely make the manifests smaller:

> curl -sI https://formulae.brew.sh/api/formula.json | grep content-length => 19898457 (~19.9MB)

> curl -sI https://formulae.brew.sh/api/cask.json | grep content-length => 4023930 (~4MB)

But the JSON API has been around for a while so brew 4.0 just makes use of it by default.

Brew 3.3 added an off-by-default HOMEBREW_INSTALL_FROM_API bool to let you install from brew's JSON-based API instead of checking out the large (and slow) homebrew/core and homebrew/taps repos: https://formulae.brew.sh/docs/api/

Brew 4 just makes that the default and deprecates the env var (you can use HOMEBREW_NO_INSTALL_FROM_API=1 if you still want to check out the git repos for some reason).



Homebrew actually passes `--compressed` here so it's smaller:

> curl -sI --compressed https://formulae.brew.sh/api/formula.json | grep content-length => 3482583 (~3.5MB)

> curl -sI --compressed https://formulae.brew.sh/api/cask.json | grep content-length => 722915 (~0.7MB)
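
For anyone who wants to double-check the rounding, here is a small sketch that converts the byte counts quoted in this thread into decimal MB and binary MiB. The `to_mb` helper is made up for illustration and is not part of brew or curl.

```shell
# Hypothetical helper: print a byte count as decimal MB and binary MiB.
to_mb() {
  LC_ALL=C awk -v b="$1" 'BEGIN { printf "%.2f MB / %.2f MiB\n", b / 1000000, b / 1048576 }'
}

to_mb 19898457   # formula.json, uncompressed
to_mb 3482583    # formula.json, with --compressed
to_mb 722915     # cask.json, with --compressed
```

Whether a figure reads as 3.3 or 3.5 depends on which of the two units is meant, so it helps to say which one you are quoting.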


Oh, nice, thanks for the correction!


> instead of checking out the large (and slow) homebrew/core and homebrew/taps repos

I actually wanted to ask why they don't just prune the history of those repos to cut down on their clone size. On the one hand, I could imagine some software historian being curious what flags were required to build sqlite 3.6.20 from 2009 (872f50ac61d7), but OTOH building a whole new distribution system instead of effectively git-squash feels weird.

As someone who regularly patches those local repos to work around silliness, I was bitten by that newfound API business and I'm thankful that env-var exists.

As someone who's seen The New Homebrew Way, I dread the day my HOMEBREW_NO_INSTALL_FROM_API gets taken away :-(


> OTOH building a whole new distribution system instead of effectively git-squash feels weird

Or even just `clone` with `--depth=1`? Just because the history exists doesn't mean you have to fetch it.
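
A rough sketch of what that looks like, using a throwaway local repo as a stand-in for a large tap. The paths, commit messages, and the `demo` identity are all made up for the demo; nothing here touches the real Homebrew repos.

```shell
set -eu
work=$(mktemp -d)
origin="$work/origin"

# Build a tiny origin repo with three commits (stand-in for a large tap).
git init -q "$origin"
for i in 1 2 3; do
  echo "$i" > "$origin/file.txt"
  git -C "$origin" add file.txt
  git -C "$origin" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "commit $i"
done

# file:// makes git use its normal transport; with a plain local path
# the --depth option is ignored.
git clone -q --depth=1 "file://$origin" "$work/shallow"

full_count=$(git -C "$origin" rev-list --count HEAD)
shallow_count=$(git -C "$work/shallow" rev-list --count HEAD)
echo "origin: $full_count commits, shallow clone: $shallow_count"
```

The shallow clone only has the tip commit, so the client never downloads the older history.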


I believe they used to do `--depth` but GitHub complained because their systems weren't optimized for that and it ended up being cheaper for the servers to send you the whole thing (much easier to figure out what refs you needed, or something along those lines).


I don't know anything about how git servers are implemented, but this is super weird to me. Why couldn't you basically just copy the files (excluding .git), then do `git init`, set the upstream remote, and fetch only the most recent commit? I'm guessing it wouldn't work _exactly_ like that because the git cli is convoluted and nobody remembers the exact semantics of every flag, but it seems surprising that there aren't any possible sequences of commands that do that. Even if they implemented it on the server side by just making a "fake" client locally doing this, and then sending the tarball of the repo with just that commit, it would be way more efficient than making clients download the entire history.
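
For what it's worth, something close to that sequence does seem to exist on the client side. Below is a minimal sketch against a throwaway local repo, assuming `git init`, `git remote add`, and `git fetch --depth=1` are the commands in question. The paths and the `demo` identity are invented for the demo, and it says nothing about what this costs a server like GitHub.

```shell
set -eu
work=$(mktemp -d)
origin="$work/origin"

# Throwaway origin with three commits.
git init -q "$origin"
for i in 1 2 3; do
  echo "$i" > "$origin/file.txt"
  git -C "$origin" add file.txt
  git -C "$origin" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "commit $i"
done

# Start from an empty repo, point it at the remote, and fetch only the tip.
git init -q "$work/fresh"
git -C "$work/fresh" remote add origin "file://$origin"
git -C "$work/fresh" fetch -q --depth=1 origin HEAD
git -C "$work/fresh" checkout -q FETCH_HEAD   # detached HEAD at the fetched tip

fetched_count=$(git -C "$work/fresh" rev-list --count HEAD)
tip_origin=$(git -C "$origin" rev-parse HEAD)
tip_fresh=$(git -C "$work/fresh" rev-parse HEAD)
echo "fetched $fetched_count commit(s); tips match: $([ "$tip_origin" = "$tip_fresh" ] && echo yes || echo no)"
```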



I'm not super sure how a subsequent `git pull` works with a shallow clone, but if it works the same, then such an obvious fix makes the invention of that API solution even more painful.
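
A quick local experiment to check the `git pull` part, again with a throwaway repo. The `commit` helper, paths, and `demo` identity are made up for the demo. It only shows the client side and says nothing about server load.

```shell
set -eu
work=$(mktemp -d)
origin="$work/origin"

# Hypothetical helper: add one commit to the origin repo.
commit() {
  echo "$1" > "$origin/file.txt"
  git -C "$origin" add file.txt
  git -C "$origin" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "commit $1"
}

git init -q "$origin"
commit 1
commit 2

# Shallow clone, then let upstream move on by one commit.
git clone -q --depth=1 "file://$origin" "$work/shallow"
commit 3

# A plain fast-forward pull from inside the shallow clone.
git -C "$work/shallow" pull -q --ff-only

tip_origin=$(git -C "$origin" rev-parse HEAD)
tip_shallow=$(git -C "$work/shallow" rev-parse HEAD)
is_shallow=$(git -C "$work/shallow" rev-parse --is-shallow-repository)
echo "tips match: $([ "$tip_origin" = "$tip_shallow" ] && echo yes || echo no), still shallow: $is_shallow"
```

In this setup the pull fast-forwards to the new tip and the clone stays shallow, so the day-to-day update flow looks the same as with a full clone.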


IIRC, the only thing that will potentially need to be changed is manually fetching the tags, since those won't be brought in. You can just do `git fetch --tags` to get all of the tag metadata though (without actually needing to pull the actual commits they reference; it just gives you the ability to look up a commit hash with a tag so that you can fetch it later).



