Merging Git Repos for Archival Purposes

TL;DR I had reason to want to combine several git repos into one big repo, with each original repo in its own folder, while ideally maintaining the histories of all those repos for archaeological purposes. There are many reasons why someone would want to do this, and my specific use case isn’t relevant. Good luck.

Why so complicated?

‘Hidden’ files (dotfiles) suck
Shell wildcards suck
Wildcards with selective exclusions (i.e. .git) suck
File names with spaces suck
Trailing slashes suck
Rewriting history sucks

Raw version because I don’t trust Gist and embeds and such….

    #!/bin/bash
    usage() {
    cat << EOF
    This script imports a git repo (accessible from https://\$origin/\$user/\$repo)
    and all its history as a subdirectory of a destination (available locally at \$dest)

    It is designed for non-production, archival processes and may destroy everything
    you've ever loved because you looked at it funny. You have been warned.

    The structure of the destination will end up something like this:

    ~/src
     - \$dest
       - origins
         - \$origin
           - \$user
             - \$repo

    Required Arguments:
      -u|--user:   The user that owns the repo to be imported
      -r|--repo:   The name of the repository to be imported
      -d|--dest:   The local name of the destination repository (assumed to be under ~/src)
      -o|--origin: The git server that is the origin of the repo to be imported
    EOF
    }

    # All four options are required and each takes a value: eight arguments in total
    if [ $# -lt 8 ]; then
      usage
      exit 1
    fi

    while [[ "$#" -gt 0 ]]; do
      case $1 in
        -u|--user) user="$2"; shift ;;
        -r|--repo) repo="$2"; shift ;;
        -d|--dest) dest="$2"; shift ;;
        -o|--origin) origin="$2"; shift ;;
        *) echo "Unknown parameter passed: $1"; usage; exit 1 ;;
      esac
      shift
    done

    tmp="/tmp/_${dest}_tmp"

    echo "Importing $origin/$user/$repo into $dest"

    # Start from a fresh clone of the repo to be imported
    rm -rf ~/src/"$repo"
    cd ~/src
    git clone "https://$origin/$user/$repo"
    cd "$repo"

    # For every commit: copy the tracked files out to $tmp, delete them from the
    # tree, then move the copy back in under origins/$origin/$user/$repo
    git filter-branch \
      --tree-filter "mkdir -p $tmp/origin; \
        git ls-files | cpio -pdumB $tmp/origin; \
        git ls-files | xargs -d '\n' rm -r; \
        find . -type d -empty -delete; \
        mkdir -p origins/$origin/$user; \
        mv $tmp/origin origins/$origin/$user/$repo/" \
      --tag-name-filter cat --prune-empty -- --all

    if [ $? -eq 0 ]; then
      ## WAIT PATIENTLY
      cd "../$dest"
      git remote add "$repo" "../$repo"
      git fetch "$repo" --tags
      # You're on your own if you want a different / multiple branch(es)...
      git merge --allow-unrelated-histories "$repo/master"
      git remote remove "$repo"
    else
      echo "failed for $user/$repo"
    fi
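If you want to see the overall shape of this before pointing it at anything you care about, here is a minimal sandbox sketch. It is not the script above: it builds two throwaway repos under a `mktemp` directory, and it swaps the `cpio` shuffle for a simpler `find ... -exec mv` tree filter, which copes with dotfiles and spaces in this toy case. The names `example.com`, `alice`, `demo` and the `new_repo` helper are made up for illustration, and it assumes a git recent enough to have `--allow-unrelated-histories`.

```shell
#!/bin/bash
# Sandbox demo: nothing here touches ~/src or any real remote.
set -euo pipefail
export FILTER_BRANCH_SQUELCH_WARNING=1  # skip filter-branch's deprecation pause on newer git

sandbox="$(mktemp -d)"
parent="origins/example.com/alice"      # hypothetical origin and user
sub="$parent/demo"                      # hypothetical repo name

new_repo() {                            # hypothetical helper: empty repo on branch master
  git init -q "$1"
  git -C "$1" symbolic-ref HEAD refs/heads/master
  git -C "$1" config user.name "Demo"
  git -C "$1" config user.email "demo@example.com"
}

new_repo "$sandbox/demo"
new_repo "$sandbox/dest"

# Source repo: a dotfile and a file name with a space, over two commits
printf 'x\n' > "$sandbox/demo/.hidden"
git -C "$sandbox/demo" add -A
git -C "$sandbox/demo" commit -q -m "demo: first"
printf 'y\n' > "$sandbox/demo/with space.txt"
git -C "$sandbox/demo" add -A
git -C "$sandbox/demo" commit -q -m "demo: second"

# Destination repo: one unrelated commit
printf 'z\n' > "$sandbox/dest/README"
git -C "$sandbox/dest" add -A
git -C "$sandbox/dest" commit -q -m "dest: first"

# Rewrite every commit of the source so its whole tree lives under $sub.
# The filter runs in filter-branch's scratch checkout, so ../_stash is scratch too.
(cd "$sandbox/demo" && git filter-branch --tree-filter "
    mkdir ../_stash &&
    find . -mindepth 1 -maxdepth 1 -exec mv {} ../_stash/ \; &&
    mkdir -p '$parent' &&
    mv ../_stash '$sub'
  " -- --all >/dev/null)

# Pull the rewritten history into the destination and merge it
git -C "$sandbox/dest" remote add demo "$sandbox/demo"
git -C "$sandbox/dest" fetch -q demo
git -C "$sandbox/dest" merge -q --allow-unrelated-histories -m "Import demo" demo/master
git -C "$sandbox/dest" remote remove demo

# Inspect the result: files under origins/..., plus both histories in the log
git -C "$sandbox/dest" ls-files
git -C "$sandbox/dest" log --oneline
echo "Sandbox left at $sandbox (remove it when done)"
```

The destination should end up with its own commit, the two rewritten `demo` commits and a merge commit, with the imported files sitting under `origins/example.com/alice/demo`.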

June 23, 2020 · Andrew Bolster

Mercurial to Git transfer; problems, and pitfalls.

Finally decided to move my research work across to GitHub; seems the ‘in’ thing to do. Also I wanted to get more into the Git swing of things, and using intermediary tools like hg-git seems a bit contrived for a one-person project. I’ve enjoyed using Bitbucket but it’s just not quite as polished. That and GH has better integration with pretty much everything… Sorry. Went through the process described here but it’s not really explained very well, so I’m adding my touch of idiot-proof magic. ...

April 26, 2013 · Andrew Bolster

Git Split Repository With Commit History

Thesis submitted, Viva cleared (with minor corrections) but this post isn’t about all that…

Simple one: how do you go from one monolithic project repository to multiple repositories without losing all that tasty tasty commit history?

    #! /bin/zsh
    #
    # git_split.sh Current_Repo username new_repo {list of files/folders you want to keep}
    # Copyright (C) 2016 bolster <[email protected]>
    #
    # Distributed under terms of the MIT license.
    #

    BASEDIR=$1
    INITDIR=`pwd`
    NEWREPO="[email protected]:$2/$3.git"
    shift 3
    FILTER_ARGS=$@    # everything left over is a path to keep

    # Work on a throwaway copy so the original repo is left alone
    TMPDIR=`mktemp -d -t ${BASEDIR}_XXXXXXXXX`
    echo $BASEDIR $TMPDIR $FILTER_ARGS
    cp -ra $BASEDIR/. $TMPDIR
    cd $TMPDIR

    # For every commit: empty the index, then restore only the wanted paths;
    # commits that end up empty are pruned
    git filter-branch --index-filter "git rm --cached -qr --ignore-unmatch -- . && git reset -q \$GIT_COMMIT -- $FILTER_ARGS" --prune-empty -- --all &&
      git repack -a -d -f --depth=250 --window=250 &&
      git remote set-url origin $NEWREPO &&
      git gc &&
      git push -u origin master

    ls -latrh
    cd $INITDIR
    rm -rf $TMPDIR

YMMV, IANAGG, No refunds, Safety not guaranteed ...
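To get a feel for what that index filter does without pushing anything anywhere, here is a small sandbox sketch. It runs the same `git rm --cached` / `git reset` filter against a throwaway repo. The repo layout (`keep/`, `drop/`) and the `commit_file` helper are invented for the demo, and it is written for bash rather than zsh.

```shell
#!/bin/bash
# Sandbox demo of the split filter: no remotes, no pushes.
set -euo pipefail
export FILTER_BRANCH_SQUELCH_WARNING=1  # skip filter-branch's deprecation pause on newer git

sandbox="$(mktemp -d)"
mono="$sandbox/mono"                    # hypothetical monolithic repo
git init -q "$mono"
git -C "$mono" symbolic-ref HEAD refs/heads/master
git -C "$mono" config user.name "Demo"
git -C "$mono" config user.email "demo@example.com"

commit_file() {                         # hypothetical helper: write $2 to path $1, commit as $3
  mkdir -p "$mono/$(dirname "$1")"
  printf '%s\n' "$2" > "$mono/$1"
  git -C "$mono" add -A
  git -C "$mono" commit -q -m "$3"
}

commit_file keep/a.txt one "keep: add a"
commit_file drop/b.txt two "drop: add b"      # touches only the unwanted folder
commit_file keep/a.txt three "keep: edit a"

# Same filter as the script above, keeping only the keep/ folder
(cd "$mono" && git filter-branch --index-filter \
  "git rm --cached -qr --ignore-unmatch -- . && git reset -q \$GIT_COMMIT -- keep" \
  --prune-empty -- --all >/dev/null)

# Inspect what survived
git -C "$mono" ls-tree -r --name-only HEAD
git -C "$mono" log --format=%s
echo "Sandbox left at $sandbox (remove it when done)"
```

The commit that only touched `drop/` becomes empty after filtering, so `--prune-empty` drops it, and the remaining history contains just the two `keep/` commits.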

Andrew Bolster