I'm currently working with a project that contains many small git repos. Many of them are small utility scripts written in Golang and Bash. Luckily, the internal git server has a very simple API I can use to fetch all the project names.

The payload looks similar to the following. I have taken out a lot of nonsense and wrapped it in a 'json' variable for this post.

json='
{
   "limit":25,
   "isLastPage":false,
   "values":[
      {
         "name":"project-name1",
         "links":{
            "clone":[
               {
                  "href":"ssh://git@someip:port/project1.git",
                  "name":"ssh"
               },
               {
                  "href":"http://someip:port/project1.git",
                  "name":"http"
               }
            ],
            "self":[
               {
                  "href":"http://someip/project1"
               }
            ]
         }
      },
      {
         "name":"project-name2",
         "links":{
            "clone":[
               {
                  "href":"ssh://git@someip:port/project2.git",
                  "name":"ssh"
               },
               {
                  "href":"http://someip:port/project2.git",
                  "name":"http"
               }
            ],
            "self":[
               {
                  "href":"http://someip/project2"
               }
            ]
         }
      }
   ]
}
'

The payload contains two projects, each with both an ssh and an http clone link. What I want to do is iterate over this once, grab all the information I need, and clone the repositories.

I'm also going to use Bash, as it rocks. Unlike Go, Bash doesn't have a built-in JSON processor, so parsing JSON isn't easy in Bash on its own. However, there is a processor we can install that makes it much easier.

jq is a "lightweight and flexible command-line JSON processor".
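As a first taste, here is a minimal sketch of jq pulling the project names out of a payload shaped like the one above. The trimmed-down 'json' value is a stand-in I made up for illustration, not the real server response.

```shell
# Trimmed-down stand-in for the payload above (illustrative values only).
json='{"values":[{"name":"project-name1"},{"name":"project-name2"}]}'

# .values[] iterates over the array; .name picks one field from each element.
# -r prints raw strings instead of JSON-quoted ones.
names=$(printf '%s' "$json" | jq -r '.values[].name')

# Prints one project name per line.
printf '%s\n' "$names"
```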

After reading the documentation, I came up with the following expression:

repos=$(echo "$json" | jq '.values[] | select(.links.clone[].name == "ssh") | {repoName: .name, repoType: .links.clone[].name, repoHref: .links.clone[].href}')

In short, I'm trying to iterate over the 'values' array and each nested 'clone' array, filter so that only the 'ssh' links remain, and keep the name of the project alongside them. After that I want to assign the values from the clone entry and its parent to the following fields:

  • 'repoName' - values[].name
  • 'repoType' - values[].links.clone[].name
  • 'repoHref' - values[].links.clone[].href

This looks pretty straightforward. Let's print out the results:

echo "$repos" | jq

{
  "repoName": "project-name1",
  "repoType": "ssh",
  "repoHref": "ssh://git@someip:port/project1.git"
}
{
  "repoName": "project-name1",
  "repoType": "ssh",
  "repoHref": "http://someip:port/project1.git"
}
{
  "repoName": "project-name1",
  "repoType": "http",
  "repoHref": "ssh://git@someip:port/project1.git"
}
{
  "repoName": "project-name1",
  "repoType": "http",
  "repoHref": "http://someip:port/project1.git"
}
{
  "repoName": "project-name2",
  "repoType": "ssh",
  "repoHref": "ssh://git@someip:port/project2.git"
}
{
  "repoName": "project-name2",
  "repoType": "ssh",
  "repoHref": "http://someip:port/project2.git"
}
{
  "repoName": "project-name2",
  "repoType": "http",
  "repoHref": "ssh://git@someip:port/project2.git"
}
{
  "repoName": "project-name2",
  "repoType": "http",
  "repoHref": "http://someip:port/project2.git"
}

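Hmm, not what I was expecting. If you look at the results, it looks like jq is multiplying the arrays and returning every combination: each '.links.clone[]' in the expression is iterated independently, so 'repoType' and 'repoHref' are not taken from the same clone entry. That is why there are strange results where the repo type is http but the href is the ssh link. The select() does not help either, because it only checks that the project has an ssh link somewhere and then passes the whole project through. Maybe this isn't as easy as I thought.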
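The multiplying effect is easier to see on a toy input. The sketch below is not from the payload; the keys 'a' and 'b' and their values are made up. It puts two independent generators inside one object constructor, the same shape as the two '.links.clone[]' in my expression.

```shell
# Two generators in one object constructor: jq emits one object per combination.
# -n means "no input", -c prints each result on a single line.
combos=$(jq -n -c '{a: (1, 2), b: ("x", "y")}')

printf '%s\n' "$combos"

# Two values for a times two values for b gives four objects.
count=$(printf '%s\n' "$combos" | grep -c '^{')
echo "objects emitted: $count"
```

Two clone links multiplied by two clone links gives four objects per project, which is where the eight results above come from.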

After some tweaking and a few games of Tetris (Game Boy version on the Switch), I moved the filter down so that it runs on the 'clone' array itself. This looks more promising, however:

repos=$(echo "$json" | jq '.values[] | .links.clone[] | select(.name == "ssh") | {repoName: .parent.name, repoType: .name, repoHref: .href}')

echo "$repos" | jq

{
  "repoName": null,
  "repoType": "ssh",
  "repoHref": "ssh://git@someip:port/project1.git"
}
{
  "repoName": null,
  "repoType": "ssh",
  "repoHref": "ssh://git@someip:port/project2.git"
}

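Notice that 'repoName' is null. jq has no built-in '.parent'; the expression just looks for a key called 'parent' on the clone entry, which doesn't exist. Obviously I am not capturing the parent element properly.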
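To convince myself where the null comes from, here is a small sketch that runs the same lookup on a single clone entry copied from the payload. By the time the pipeline reaches this point, the clone entry is all jq can see.

```shell
# A single clone entry, as it looks once .links.clone[] has been applied.
clone='{"href":"ssh://git@someip:port/project1.git","name":"ssh"}'

# .parent is an ordinary key lookup; the key is missing, so jq yields null,
# and .name on null is null as well.
result=$(printf '%s' "$clone" | jq '.parent.name')

printf '%s\n' "$result"   # prints: null
```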

It turns out jq lets you capture a value in a variable. We will do this to keep a reference to the parent in our filtered result.

Let's skip to the working solution:

repos=$(echo "$json" | jq '.values[] | . as $value | .links.clone[] | select(.name == "ssh") | { repoName: $value.name, repoType: .name, repoHref: .href }')

echo "$repos" | jq

{
  "repoName": "project-name1",
  "repoType": "ssh",
  "repoHref": "ssh://git@someip:port/project1.git"
}
{
  "repoName": "project-name2",
  "repoType": "ssh",
  "repoHref": "ssh://git@someip:port/project2.git"
}

Perfect... I am able to capture all the data I need.

Here is what the expression does, block by block:

  • .values[]: iterates over the 'values' array, emitting one project at a time.

  • . as $value: stores the current project in a variable named $value for later use.

  • | .links.clone[]: iterates over that project's clone links, emitting one link at a time.

  • | select(.name == "ssh"): keeps only the links whose name field is equal to "ssh".

  • The rest takes the data and puts it into a newly constructed JSON object, one per matching link.
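One thing worth knowing: the expression emits a stream of separate objects, not one JSON array. If a real array is wanted, wrapping the whole filter in square brackets should do it. A minimal sketch, using a one-project stand-in for the payload:

```shell
# One-project stand-in for the payload (same shape, trimmed for the example).
json='{"values":[{"name":"project-name1","links":{"clone":[
  {"href":"ssh://git@someip:port/project1.git","name":"ssh"},
  {"href":"http://someip:port/project1.git","name":"http"}]}}]}'

# [ ... ] collects every object the inner filter emits into a single array.
array=$(printf '%s' "$json" | jq -c '[.values[] | . as $value | .links.clone[]
  | select(.name == "ssh")
  | {repoName: $value.name, repoType: .name, repoHref: .href}]')

printf '%s\n' "$array"
```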

Now that I am able to clone all the repos, it's time to merge them into a monorepo and play more Tetris.
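For completeness, here is a rough sketch of how the clone step itself could look. It is a dry run that only prints the git commands, and it assumes each repo is cloned into a directory named after the project, which is my choice for the example rather than anything the server dictates. The 'json' value is a compact copy of the payload above.

```shell
# Compact copy of the two-project payload from earlier in the post.
json='{"values":[
 {"name":"project-name1","links":{"clone":[
   {"href":"ssh://git@someip:port/project1.git","name":"ssh"},
   {"href":"http://someip:port/project1.git","name":"http"}]}},
 {"name":"project-name2","links":{"clone":[
   {"href":"ssh://git@someip:port/project2.git","name":"ssh"},
   {"href":"http://someip:port/project2.git","name":"http"}]}}]}'

# Emit one "name<TAB>href" line per ssh link. @tsv turns an array into a
# tab-separated row and -r prints it without JSON quoting.
lines=$(printf '%s' "$json" | jq -r '.values[] | . as $value | .links.clone[]
  | select(.name == "ssh") | [$value.name, .href] | @tsv')

tab=$(printf '\t')

# Dry run: build the commands instead of running them.
plan=$(printf '%s\n' "$lines" | while IFS="$tab" read -r name href; do
  echo "git clone $href $name"
done)

printf '%s\n' "$plan"
```

Swapping the echo for the real git clone "$href" "$name" would perform the clones, assuming ssh access to the server is already set up.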

