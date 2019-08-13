👩🏻💻 Senior software engineer 💜 Cybersecurity training, automation 🛡 Core maintainer OWASP WSTG
We list the repositories we want to clone in a text file, `gh-repos.txt`, like this:
git@github.com:username/first-repository.git
git@github.com:username/second-repository.git
git@github.com:username/third-repository.git
xargs -n1 git clone < gh-repos.txt
<gh-repos.txt xargs -n1 git clone
To run a command for each line of our list, `gh-repos.txt`, we use `xargs -n1`. The tool `xargs` reads items from standard input and runs the command we give it with those items as arguments (it runs `echo` if we don’t give it one). By default, it assumes that items are separated by blanks; newlines also work and make our list easier to read. The flag `-n1` tells `xargs` to use `1` argument, or in our case, one line, per command. We build our command with `git clone`, which `xargs` then executes for each line. Ta-da.
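Before cloning anything for real, it can help to preview what `xargs` will run. The following is a minimal sketch, assuming we put `echo` in front of the command so that each command line is printed rather than executed; the two URLs are placeholders taken from the example list, and `dry_run` is just a name chosen for illustration.

```shell
# Dry run: with echo in front, xargs prints each command line it would
# run instead of actually cloning anything.
# The URLs below are placeholders, not real repositories.
dry_run=$(printf '%s\n' \
  'git@github.com:username/first-repository.git' \
  'git@github.com:username/second-repository.git' |
  xargs -n1 echo git clone)

# Show the commands, one per input line.
printf '%s\n' "$dry_run"
```

Once the printed commands look right, dropping the `echo` gives the real one-liner.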
One option is `git push --set-upstream`, but I don’t find this to be very convenient for using GitLab as a backup. As I work with my repositories in the future, I’d like to run one command that pushes to both GitHub and GitLab without additional effort on my part.
cp gh-repos.txt gl-repos.txt
vim gl-repos.txt
:%s/\<github\>/gitlab/g
:wq
This gives us `gl-repos.txt`, which looks like:
git@gitlab.com:username/first-repository.git
git@gitlab.com:username/second-repository.git
git@gitlab.com:username/third-repository.git
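If we would rather not open an editor, the same list can probably be produced non-interactively. This is a sketch using `sed` in place of the Vim substitution, not the approach used above. It assumes the SSH-style URLs shown in the example, and it works in a temporary directory (`workdir`, a name chosen for illustration) so that no real files are touched.

```shell
# Work in a throwaway directory so this sketch leaves real files alone.
workdir=$(mktemp -d)

# Placeholder list, matching the format of gh-repos.txt above.
printf '%s\n' \
  'git@github.com:username/first-repository.git' \
  'git@github.com:username/second-repository.git' > "$workdir/gh-repos.txt"

# Swap only the host name, so a user or repository name that happens to
# contain "github" is left unchanged.
sed 's/@github\.com:/@gitlab.com:/' "$workdir/gh-repos.txt" > "$workdir/gl-repos.txt"

cat "$workdir/gl-repos.txt"
```

Unlike the Vim command, which replaces every whole-word "github" on a line, this variation only touches the host part of each URL.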
awk -F'\/|(\.git)' '{system("cd ~/FULL/PATH/" $2 " && git remote set-url origin --add " $0 " && git push")}' gl-repos.txt
`~/FULL/PATH/` should be the full path to the directory containing our GitHub repositories.
This assumes the branch we’re pushing is `master`.
This one-liner takes our `gl-repos.txt` file as input. With `awk`, it splits off the name of the directory containing the repository on our local machine, and uses these pieces of information to build our larger command. If we were to `print` the output of `awk`, we’d see:
cd ~/FULL/PATH/first-repository && git remote set-url origin --add git@gitlab.com:username/first-repository.git && git push
cd ~/FULL/PATH/second-repository && git remote set-url origin --add git@gitlab.com:username/second-repository.git && git push
cd ~/FULL/PATH/third-repository && git remote set-url origin --add git@gitlab.com:username/third-repository.git && git push
`awk` can split input based on field separators. The default separator is whitespace, but we can change this by passing the `-F` flag. Besides single characters, we can also use a regular expression field separator. Since our repository URLs have a set format, we can grab the repository names by asking for the substring between the slash character `/` and the end of the URL, `.git`.
Here’s how our separator, `\/|(\.git)`, breaks down:
`\/` is an escaped `/` character;
`|` means “or”, telling `awk` to match either expression;
`(\.git)` is the group at the end of our URL that matches “.git”, with an escaped `.` character. This is a bit of a cheat, as “.git” isn’t strictly splitting anything (there’s nothing on the other side) but it’s an easy way for us to take this bit off.
Once we’ve told `awk` where to split, we can grab the right substring with the field operator. We refer to our fields with a `$` character, then by the field’s column number. In our example, we want the second field, `$2`. Here’s what all the substrings look like:
1: git@gitlab.com:username
2: first-repository
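To check the split on a single URL, we can print the fields directly. The sketch below uses a variation on the separator, `/|[.]git`, which matches the same two things without any backslashes; that is an assumption made here because `awk` implementations may differ in how they process backslash escapes passed to `-F`. The URL is the placeholder from the example list, and `fields` is a name chosen for illustration.

```shell
# Placeholder URL in the same format as the lines of gl-repos.txt.
url='git@gitlab.com:username/first-repository.git'

# Split on either a slash or the literal string ".git" and label the
# first two fields. [.] is a bracket expression matching a literal dot.
fields=$(printf '%s\n' "$url" |
  awk -F'/|[.]git' '{ print "1: " $1; print "2: " $2 }')

printf '%s\n' "$fields"
```

The second field is the repository name, which doubles as the local directory name when the repository was cloned with the default settings.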
The whole line, in our case the full URL, is `$0`. To write the command, we just substitute the field operators for the repository name and URL. Running this with `print` as we’re building it can help to make sure we’ve got all the spaces right.
awk -F'\/|(\.git)' '{print "cd ~/FULL/PATH/" $2 " && git remote set-url origin --add " $0 " && git push"}' gl-repos.txt
Once the printed commands look right, we swap `print` for `system()`. By handing each built string to `system()` instead of printing it, each command will run as soon as it is built. The `system()` function creates a child process that executes our command, then returns once the command is completed. In plain English, this lets us perform the Git commands on each repository, one-by-one, without breaking from our main process in which `awk` is doing things with our input file. Here’s our final command again, all put together.
awk -F'\/|(\.git)' '{system("cd ~/FULL/PATH/" $2 " && git remote set-url origin --add " $0 " && git push")}' gl-repos.txt
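Because `system()` returns the exit status of the command it ran, we could also use it to notice when something fails. The following is a small sketch of that behavior only, using the standard `true` and `false` utilities as stand-ins for a command that succeeds and one that fails; `report` is a name chosen for illustration, and nothing here touches a repository.

```shell
# system() waits for the child process and returns its exit status,
# so awk can tell a successful command from a failed one.
report=$(awk 'BEGIN {
  ok = system("true")       # a command that succeeds
  failed = system("false")  # a command that fails
  print "true returned " ok
  print "false returned " (failed != 0 ? "non-zero" : "zero")
}')

printf '%s\n' "$report"
```

In the one-liner above, the same return value could be checked to print a message for any repository where the `cd`, `git remote`, or `git push` step did not succeed.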
If we run `git remote -v` in one of our repository directories, we’ll see:
origin git@github.com:username/first-repository.git (fetch)
origin git@github.com:username/first-repository.git (push)
origin git@gitlab.com:username/first-repository.git (push)
`git push` without arguments will push the current branch to both remote repositories.
`git pull` will generally only try to pull from the remote repository you originally cloned from (the URL marked `(fetch)` in our example above). Pulling from multiple Git repositories at the same time is possible, but complicated, and beyond the scope of this post. Here’s an explanation of pushing and pulling to multiple remotes to help get you started, if you’re curious. The Git documentation on remotes may also be helpful.
`xargs` and `awk` can help to automate and alleviate a lot of tediousness in our work. However, there are some downsides.
One-liners like these are harder to write than `if` or `while` loops, and certainly more complicated to read. It’s likely that when we write them, we’ll miss a single quote or closing parenthesis somewhere; and as I hope this post demonstrates, they can take quite a bit of explaining, too. So why use them?