keeperbion.blogg.se

Git lfs vs git annex











  1. GIT LFS VS GIT ANNEX CODE
  2. GIT LFS VS GIT ANNEX WINDOWS

It can take a very long time to sync content when the other repository is not guaranteed to have all the content it's supposed to have, since in that scenario the existence and checksum of each annexed file have to be checked.
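To see why that check is expensive, here is a minimal Python sketch of verifying a content store. It is not git annex's own code: the `verify_store` helper, the flat directory layout, and the use of a SHA-256 digest as the file name are all assumptions made for illustration. The point it shows is that every file has to be located and then read in full.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_store(store: Path, expected: list[str]) -> dict[str, str]:
    """Classify each expected checksum as 'ok', 'missing', or 'corrupt'.

    Assumption: each blob is stored in `store` under its own SHA-256 hex digest.
    """
    report = {}
    for checksum in expected:
        blob = store / checksum
        if not blob.is_file():
            report[checksum] = "missing"
        elif sha256_of(blob) != checksum:
            report[checksum] = "corrupt"
        else:
            report[checksum] = "ok"
    return report


def demo() -> dict[str, str]:
    """Build a tiny store with one good, one damaged, and one absent blob."""
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp)
        good = hashlib.sha256(b"microscope image").hexdigest()
        damaged = hashlib.sha256(b"original bytes").hexdigest()
        absent = hashlib.sha256(b"never copied").hexdigest()
        (store / good).write_bytes(b"microscope image")
        (store / damaged).write_bytes(b"bit rot")  # content no longer matches its name
        report = verify_store(store, [good, damaged, absent])
        return {"good": report[good], "damaged": report[damaged], "absent": report[absent]}


if __name__ == "__main__":
    print(demo())
```

The work grows with the total number of bytes stored, not just the number of files, which is why a full check of a large annex is slow.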


The main downside is that you have to remember to sync (push) the git-annex branch _and_ copy the annexed content, as well as pushing your main branch.
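As a reminder of those three steps, here is a small Python sketch that lists the commands involved. The remote name `origin`, the branch name `main`, and the `publish_commands` helper are assumptions for illustration, and it assumes the remote is one that can hold annexed content. By default it only prints the commands.

```python
import subprocess


def publish_commands(remote: str = "origin", branch: str = "main") -> list[list[str]]:
    """Return the commands that publish both git history and annexed content.

    `remote` and `branch` are hypothetical defaults; adjust for your repository.
    """
    return [
        ["git", "push", remote, branch],                # your own commits
        ["git", "push", remote, "git-annex"],           # git annex's bookkeeping branch
        ["git", "annex", "copy", "--to", remote, "."],  # the large file content
    ]


def publish(remote: str = "origin", branch: str = "main", dry_run: bool = True) -> None:
    """Print each command; also run it when dry_run is False."""
    for command in publish_commands(remote, branch):
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    publish()  # dry run: prints the three commands without touching any repository
```

Wrapping the steps in one helper like this is one way to avoid forgetting the content copy, which is the step plain `git push` never does for you.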

GIT LFS VS GIT ANNEX CODE

I use git annex in a Git LFS-like fashion to store experimental data (microscope images, etc.) in the same repository as the code used to analyze it.

GIT LFS VS GIT ANNEX WINDOWS

Yeah, it solves the problem of putting big binaries that don't compress well inside git. If you've ever tried that, git starts to choke at around a few gigabytes (I think it's the packing/diffing algorithms). GitHub recommends that you keep repos under 1 GB, and definitely under 5 GB, and they probably have a hard limit. So what git annex does is simply store symlinks that point to the big files, which live inside .git/annex, and then it has algorithms for managing and syncing the big files. I don't love symlinks and neither does the author, but it seems to work fine. I just do `ls -L -l` instead of `ls -l` to follow the symlinks.

I think package repos are something like 300 GB, which should be easily manageable by git annex. And again, you don't have to check out everything eagerly. I'm also pretty certain that git annex could support 3 TB or 30 TB repos if the file system has enough space. For container images, I think you could simply store layers as files, which will save some space across many versions.

There's also Git LFS, which GitHub supports, but git annex seems more truly distributed, which I like.

I only recently started using it, but I think most of the limitation on metadata is from git itself. (Remember, all the metadata is in git; the data is in the "annex".) This person said they put over a million files in a single git repo and pushed it to GitHub, and that was in 2015. I'm using git annex for a repository with 100K+ files and it seems totally fine. If you're running on a Raspberry Pi, YMMV, but IME Raspberry Pis are extremely slow at tasks like compiling CPython, so it wouldn't surprise me if they're also slow at running git. I remember measuring, and a Raspberry Pi with a 5x lower clock rate than an Intel CPU (700 MHz vs. 3.5 GHz) was more like fifty times slower, not five times slower.

That said, 500 million files is probably too many for one repo. But I would guess not all files need a globally consistent version, so you can have multiple repos. Also, `df --inodes` on my 4T and 8T drives shows 240 million, so you would most likely have to format a single drive in a special way. I think the sync algorithms would probably get slow at that number.

It's definitely not a cloud storage replacement now, but I guess my goal is to avoid cloud storage :) Although git annex is complementary to the cloud and has S3 and Glacier back ends, among many others.

Git annex is pretty flexible, more of a framework for storing large files in git than a basic sync utility. ("Large" meaning larger than you'd want to directly commit to git.) If you're running Git Annex Assistant, it does pretty much work as basic file sync of a directory. But you can also use it with normal git commits, like you would Git LFS. Or as a file repository, as chubot suggested. The flexibility makes it a little difficult to get started.

The basic idea is that each file targeted by `git annex add` gets replaced by a symlink pointing to its content. The content is managed by git annex and lives as a checksum-addressable blob. The symlink is staged in git to be committed and tracked by the usual git mechanisms. (There is an alternate non-symlink mechanism for Windows that I don't use and know little about.) Git annex keeps a log of which host has (had) which file in a branch named "git-annex".
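The symlink idea can be sketched in a few lines of Python. This is a toy model, not git annex's real on-disk layout: the `toy_annex_add` helper, the `objects` directory, and the use of a bare SHA-256 digest as the blob name are assumptions for illustration, and it assumes a POSIX system where symlinks can be created freely. It also shows why `ls -l` and `ls -L -l` report different sizes.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def toy_annex_add(path: Path, objects: Path) -> Path:
    """Move a file into a checksum-addressed store and leave a symlink behind.

    Assumption: blobs are named by the SHA-256 of their content inside `objects`.
    """
    checksum = hashlib.sha256(path.read_bytes()).hexdigest()
    objects.mkdir(parents=True, exist_ok=True)
    blob = objects / checksum
    if blob.exists():
        path.unlink()        # identical content is stored only once
    else:
        path.replace(blob)   # move the content into the store
    # A relative link keeps working if the whole directory tree is moved.
    path.symlink_to(os.path.relpath(blob, start=path.parent))
    return blob


def demo() -> dict:
    content = b"x" * 10_000
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        image = repo / "data" / "scan.tif"
        image.parent.mkdir()
        image.write_bytes(content)
        blob = toy_annex_add(image, repo / "objects")
        return {
            "is_symlink": image.is_symlink(),
            "blob_name_is_checksum": blob.name == hashlib.sha256(content).hexdigest(),
            "content_intact": image.read_bytes() == content,
            "link_target": os.readlink(image),
            # os.lstat describes the link itself (like `ls -l`);
            # os.stat follows it to the content (like `ls -L -l`).
            "followed_size": os.stat(image).st_size,
            "link_is_small": os.lstat(image).st_size < os.stat(image).st_size,
        }


if __name__ == "__main__":
    for key, value in demo().items():
        print(f"{key}: {value}")
```

Only the small symlink would be committed to git in this model; the 10,000-byte blob stays in the store, which is what keeps the git history light.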












