ETOOBUSY 🚀 minimal blogging for the impatient
jq
Notes and cheats about jq, in a kind-of cookbook form.
jq is like sed for JSON data - you can use it to slice and filter and map and transform structured data with the same ease that sed, awk, grep and friends let you play with text.
The notes below reflect my personal level of understanding; many of
these tasks surely have better solutions that I simply don’t know yet.
You’re encouraged to share more with
flavio [that-email-char] polettix.it.
SYNOPSIS
Pretty-print
$ jq <input.json # or, better to learn
$ jq <input.json .
Overview
jq is powerful, but anybody reading this page already knows about
that. I can only add that I often use it together with Romeo, which
has a json2csv sub-command to generate CSV, as well as a csv2json
sub-command to feed jq with data coming from a CSV file.
Command-line Arguments
Most of the time I use it in a pipe on the command line, just to get a pretty-print:
$ printf '{"top":"foo","in":[1,2,3]}' | jq
{
"top": "foo",
"in": [
1,
2,
3
]
}
There will be color in your terminal but I’m too lazy to fiddle with it here in this page.
To do anything interesting, like getting only the value of top, we
have to add a filter:
$ printf '{"top":"foo","in":[1,2,3]}' | jq .top
"foo"
If we have some files to work on, they can be provided on the command line after the filter. It will happily work on each of them:
$ printf '{"top":"foo","in":[1,2,3]}' > input.json
$ jq .top input.json input.json
"foo"
"foo"
What’s with those double quotes? jq tries hard to return valid JSON,
so they are JSON strings. If we only care about the value, we can ask
for raw data with command-line option -r:
$ printf '{"top":"foo","in":[1,2,3]}' | jq -r .top
foo
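Raw output is what makes jq handy inside shell scripts. The following is a minimal sketch, assuming a POSIX-like shell; the variable names top and arr are only for illustration. It also shows that -r changes how strings are printed, while other values still come out as JSON (-c just puts the output on one line).

```shell
json='{"top":"foo","in":[1,2,3]}'

# With -r the string is printed without JSON quoting, ready for the shell
top=$(printf '%s' "$json" | jq -r .top)
printf 'value is %s\n' "$top"

# -r only affects strings; an array is still printed as JSON
arr=$(printf '%s' "$json" | jq -r -c .in)
printf 'array is %s\n' "$arr"
```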
Having many input files does not yield a single valid JSON document, because each input is processed separately:
$ printf '{"top":"foo","bar":"baz"}' > input.json
$ jq . input.json input.json
{
"top": "foo",
"bar": "baz"
}
{
"top": "foo",
"bar": "baz"
}
It’s possible to put all inputs in a wrapping array with command-line
option -s:
$ printf '{"top":"foo","bar":"baz"}' > input.json
$ jq -s . input.json input.json
[
{
"top": "foo",
"bar": "baz"
},
{
"top": "foo",
"bar": "baz"
}
]
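Once the inputs are slurped, a filter can work across all of them at once. Here is a small sketch, assuming two throw-away files created in a temporary directory (the names one.json and two.json are made up for the example):

```shell
dir=$(mktemp -d)
printf '{"top":"foo","bar":"baz"}' > "$dir/one.json"
printf '{"top":"goo","bar":"qux"}' > "$dir/two.json"

# -s collects all inputs into one array, -c prints the result on one line
tops=$(jq -s -c 'map(.top)' "$dir/one.json" "$dir/two.json")
printf '%s\n' "$tops"

rm -rf "$dir"
```

Without -s the same filter would fail, because each input would be an object and not an array.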
Slicing
Adding objects
When we add two objects with +, we get back an object with all
key/value pairs from both addends:
$ printf '{"foo":1}' | jq '. + {bar:2}'
{
"foo": 1,
"bar": 2
}
The last object wins if keys overlap:
$ printf '{"foo":1}' | jq '. + {bar:2, foo:3}'
{
"foo": 3,
"bar": 2
}
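One thing worth keeping in mind is that + only looks at top-level keys: when a key overlaps, the whole value is replaced, even if both values are objects. A quick sketch of the difference with the * operator, which merges objects recursively (the sample data is made up):

```shell
json='{"foo":{"x":1}}'

# "+" replaces the whole value of "foo"
shallow=$(printf '%s' "$json" | jq -c '. + {foo:{y:2}}')

# "*" merges the nested objects recursively
deep=$(printf '%s' "$json" | jq -c '. * {foo:{y:2}}')

printf '%s\n%s\n' "$shallow" "$deep"
```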
This can be applied also to implicit iterations over arrays:
$ printf '[{"foo":1},{"foo":2}]' | jq '.[] + {bar:7}'
{
"foo": 1,
"bar": 7
}
{
"foo": 2,
"bar": 7
}
The iteration “unpacks” the array, so we need to wrap this with square brackets to get the array back:
$ printf '[{"foo":1},{"foo":2}]' | jq '[.[] + {bar:7}]'
[
{
"foo": 1,
"bar": 7
},
{
"foo": 2,
"bar": 7
}
]
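The "unpack, transform, wrap again" pattern is what the map builtin does, so the same result can probably be written in a slightly more readable way. A minimal sketch comparing the two forms:

```shell
json='[{"foo":1},{"foo":2}]'

# Explicit iteration, wrapped back into an array
wrapped=$(printf '%s' "$json" | jq -c '[.[] + {bar:7}]')

# Same thing with map, which applies the filter to each element
mapped=$(printf '%s' "$json" | jq -c 'map(. + {bar:7})')

printf '%s\n%s\n' "$wrapped" "$mapped"
```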
This is not limited to extraction of data, as we can use the +=
operator to enrich a sub-array of objects:
$ jq <input.json .
{
"foo": [
{
"baz": 1
},
{
"baz": 13
}
]
}
$ jq <input.json '.foo[] += {bar:2}'
{
"foo": [
{
"baz": 1,
"bar": 2
},
{
"baz": 13,
"bar": 2
}
]
}
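When the added object is a constant, += behaves like an update with |= where each element is replaced by itself plus the constant. A small sketch of the two spellings, on the same data as above:

```shell
json='{"foo":[{"baz":1},{"baz":13}]}'

# Arithmetic update-assignment
plus=$(printf '%s' "$json" | jq -c '.foo[] += {bar:2}')

# Plain update: inside the parentheses "." is each element of .foo
pipe=$(printf '%s' "$json" | jq -c '.foo[] |= (. + {bar:2})')

printf '%s\n%s\n' "$plus" "$pipe"
```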
This is not limited to adding fixed stuff, e.g. we can get the key/value to add from another part of the upper object:
$ jq <input.json .
{
"bar": 2,
"foo": [
{
"baz": 1
},
{
"baz": 13
}
]
}
$ jq <input.json '.foo[] += {bar}'
{
"bar": 2,
"foo": [
{
"baz": 1,
"bar": 2
},
{
"baz": 13,
"bar": 2
}
]
}
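Two things are going on here, as far as I understand it. First, {bar} is shorthand for {bar: .bar}. Second, the right-hand side of += is evaluated against the top-level input and not against each element of .foo, which is why the top-level bar is found. A sketch that spells out the long form:

```shell
json='{"bar":2,"foo":[{"baz":1},{"baz":13}]}'

# Shorthand: {bar} takes the value of .bar from the current input
short=$(printf '%s' "$json" | jq -c '.foo[] += {bar}')

# Long form of the same object construction
long=$(printf '%s' "$json" | jq -c '.foo[] += {bar: .bar}')

printf '%s\n%s\n' "$short" "$long"
```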
This is by no means limited to adding a single key/value pair:
$ jq <input.json .
{
"bar": {
"galook": 9,
"aargh": 0
},
"foo": [
{
"baz": 1
},
{
"baz": 13
}
]
}
$ jq <input.json '.foo[] += .bar'
{
"bar": {
"galook": 9,
"aargh": 0
},
"foo": [
{
"baz": 1,
"galook": 9,
"aargh": 0
},
{
"baz": 13,
"galook": 9,
"aargh": 0
}
]
}
This can be applied looping over a higher-level array of objects:
$ jq <input.json .
[
{
"bar": {
"galook": 9,
"aargh": 0
},
"foo": [
{
"baz": 1
},
{
"baz": 13
}
]
},
{
"bar": {
"galook": 99,
"aargh": 90
},
"foo": [
{
"baz": 91
},
{
"baz": 913
}
]
}
]
$ jq <input.json '.[] |= (.foo[] += .bar)'
[
{
"bar": {
"galook": 9,
"aargh": 0
},
"foo": [
{
"baz": 1,
"galook": 9,
"aargh": 0
},
{
"baz": 13,
"galook": 9,
"aargh": 0
}
]
},
{
"bar": {
"galook": 99,
"aargh": 90
},
"foo": [
{
"baz": 91,
"galook": 99,
"aargh": 90
},
{
"baz": 913,
"galook": 99,
"aargh": 90
}
]
}
]
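The outer |= matters because it changes what . means inside the parentheses: there, .bar is the bar of each element and not something in the top-level array. For an array, map should give the same result. A sketch on a smaller, made-up input:

```shell
json='[{"bar":{"k":1},"foo":[{"baz":1}]},{"bar":{"k":2},"foo":[{"baz":2}]}]'

# Update each element in place
updated=$(printf '%s' "$json" | jq -c '.[] |= (.foo[] += .bar)')

# Same transformation expressed with map
mapped=$(printf '%s' "$json" | jq -c 'map(.foo[] += .bar)')

printf '%s\n%s\n' "$updated" "$mapped"
```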
How can we use this? As an example, we might have a JSON document
representing groups in an LDAP directory, like the following, and we
want to generate a CSV with columns cn and member and the “right” values:
$ jq <from-ldap.json .
[
{
"cn": "odd",
"member": [
"CN=three,OU=numbers,DC=whatever",
"CN=seven,OU=numbers,DC=whatever",
"CN=eleven,OU=numbers,DC=whatever",
"CN=nineteen,OU=numbers,DC=whatever"
]
},
{
"cn": "even",
"member": [
"CN=two,OU=numbers,DC=whatever",
"CN=four,OU=numbers,DC=whatever",
"CN=twelve,OU=numbers,DC=whatever"
]
}
]
First of all, we turn each string in the member array into an object,
so that we are prepared to add more data:
$ jq <from-ldap.json '.
| (.[] |= (.member[] |= {member:.}))'
[
{
"cn": "odd",
"member": [
{
"member": "CN=three,OU=numbers,DC=whatever"
},
{
"member": "CN=seven,OU=numbers,DC=whatever"
},
{
"member": "CN=eleven,OU=numbers,DC=whatever"
},
{
"member": "CN=nineteen,OU=numbers,DC=whatever"
}
]
},
{
"cn": "even",
"member": [
{
"member": "CN=two,OU=numbers,DC=whatever"
},
{
"member": "CN=four,OU=numbers,DC=whatever"
},
{
"member": "CN=twelve,OU=numbers,DC=whatever"
}
]
}
]
Now we can apply what we learned in this section and add the cn:
$ jq <from-ldap.json '.
| (.[] |= (.member[] |= {member:.}))
| (.[] |= (.member[] += {cn}))'
[
{
"cn": "odd",
"member": [
{
"member": "CN=three,OU=numbers,DC=whatever",
"cn": "odd"
},
{
"member": "CN=seven,OU=numbers,DC=whatever",
"cn": "odd"
},
{
"member": "CN=eleven,OU=numbers,DC=whatever",
"cn": "odd"
},
{
"member": "CN=nineteen,OU=numbers,DC=whatever",
"cn": "odd"
}
]
},
{
"cn": "even",
"member": [
{
"member": "CN=two,OU=numbers,DC=whatever",
"cn": "even"
},
{
"member": "CN=four,OU=numbers,DC=whatever",
"cn": "even"
},
{
"member": "CN=twelve,OU=numbers,DC=whatever",
"cn": "even"
}
]
}
]
Time to get only the member
sub-arrays:
$ jq <from-ldap.json '.
| (.[] |= (.member[] |= {member:.}))
| (.[] |= (.member[] += {cn}))
| (.[] |= .member)'
[
[
{
"member": "CN=three,OU=numbers,DC=whatever",
"cn": "odd"
},
{
"member": "CN=seven,OU=numbers,DC=whatever",
"cn": "odd"
},
{
"member": "CN=eleven,OU=numbers,DC=whatever",
"cn": "odd"
},
{
"member": "CN=nineteen,OU=numbers,DC=whatever",
"cn": "odd"
}
],
[
{
"member": "CN=two,OU=numbers,DC=whatever",
"cn": "even"
},
{
"member": "CN=four,OU=numbers,DC=whatever",
"cn": "even"
},
{
"member": "CN=twelve,OU=numbers,DC=whatever",
"cn": "even"
}
]
]
Now we can apply add to this array of arrays and get a single,
flattened array:
$ jq <from-ldap.json '.
| (.[] |= (.member[] |= {member:.}))
| (.[] |= (.member[] += {cn}))
| (.[] |= .member)
| add'
[
{
"member": "CN=three,OU=numbers,DC=whatever",
"cn": "odd"
},
{
"member": "CN=seven,OU=numbers,DC=whatever",
"cn": "odd"
},
{
"member": "CN=eleven,OU=numbers,DC=whatever",
"cn": "odd"
},
{
"member": "CN=nineteen,OU=numbers,DC=whatever",
"cn": "odd"
},
{
"member": "CN=two,OU=numbers,DC=whatever",
"cn": "even"
},
{
"member": "CN=four,OU=numbers,DC=whatever",
"cn": "even"
},
{
"member": "CN=twelve,OU=numbers,DC=whatever",
"cn": "even"
}
]
This can be fed into romeo to finally get our CSV:
$ jq <from-ldap.json '.
| (.[] |= (.member[] |= {member:.}))
| (.[] |= (.member[] += {cn}))
| (.[] |= .member)
| add' \
| romeo json2csv
cn;member
odd;CN=three,OU=numbers,DC=whatever
odd;CN=seven,OU=numbers,DC=whatever
odd;CN=eleven,OU=numbers,DC=whatever
odd;CN=nineteen,OU=numbers,DC=whatever
even;CN=two,OU=numbers,DC=whatever
even;CN=four,OU=numbers,DC=whatever
even;CN=twelve,OU=numbers,DC=whatever
This was a demonstration for a very generic situation; in this case, we don’t need to keep the whole structure along the way, and can benefit from some implicit flattening happening automatically as we unfold arrays:
$ jq <from-ldap.json '.[] | (.member[] | {member:.}) + {cn}'
{
"member": "CN=three,OU=numbers,DC=whatever",
"cn": "odd"
}
{
"member": "CN=seven,OU=numbers,DC=whatever",
"cn": "odd"
}
{
"member": "CN=eleven,OU=numbers,DC=whatever",
"cn": "odd"
}
{
"member": "CN=nineteen,OU=numbers,DC=whatever",
"cn": "odd"
}
{
"member": "CN=two,OU=numbers,DC=whatever",
"cn": "even"
}
{
"member": "CN=four,OU=numbers,DC=whatever",
"cn": "even"
}
{
"member": "CN=twelve,OU=numbers,DC=whatever",
"cn": "even"
}
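The filter above works because {cn} is still evaluated against the group object, while the parenthesized part iterates over its members. If that feels too implicit, a variable can carry the group name along. This is only a sketch on a reduced, made-up input; $cn is just a name I picked:

```shell
json='[{"cn":"odd","member":["CN=three","CN=seven"]},{"cn":"even","member":["CN=two"]}]'

# Save the group name in $cn, then emit one object per member
out=$(printf '%s' "$json" \
  | jq -c '.[] | .cn as $cn | .member[] | {member: ., cn: $cn}')

printf '%s\n' "$out"
```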
At this point, we just need to wrap the whole thing inside an array, so that the end result is valid JSON that can be fed into romeo:
$ jq <from-ldap.json '[.[] | (.member[] | {member:.}) + {cn}]' \
| romeo json2csv
cn;member
odd;CN=three,OU=numbers,DC=whatever
odd;CN=seven,OU=numbers,DC=whatever
odd;CN=eleven,OU=numbers,DC=whatever
odd;CN=nineteen,OU=numbers,DC=whatever
even;CN=two,OU=numbers,DC=whatever
even;CN=four,OU=numbers,DC=whatever
even;CN=twelve,OU=numbers,DC=whatever
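When romeo is not at hand, jq can produce CSV by itself with the @csv format and raw output. The sketch below assumes a reduced, made-up input and adds the header row by hand; note that the result is comma-separated with quoted strings, so it is not identical to what romeo prints.

```shell
json='[{"cn":"odd","member":["CN=three,OU=numbers","CN=seven,OU=numbers"]},{"cn":"even","member":["CN=two,OU=numbers"]}]'

# First a header row, then one [cn, member] row per member; @csv
# formats each array as a CSV line and -r prints it unquoted
csv=$(printf '%s' "$json" | jq -r '
  ["cn","member"],
  (.[] | .cn as $cn | .member[] | [$cn, .])
  | @csv')

printf '%s\n' "$csv"
```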
Normalizing
Sometimes a key in an object might be missing, or hold a single string, or hold an array:
# no "bar" key/value pair
$ printf '{"top":"foo"}' > input-none.json
# value for "bar" is a string (/scalar)
$ printf '{"top":"foo","bar":"baz"}' > input-string.json
# value for "bar" is an array
$ printf '{"top":"foo","bar":["baz","galook"]}' > input-array.json
When we want to operate on bar
as an array, we’re likely to hit a wall
in the other two cases:
# All OK
$ jq '.bar[]' input-array.json
"baz"
"galook"
# Errors
$ jq '.bar[]' input-none.json
jq: error (at input-none.json:0): Cannot iterate over null (null)
$ jq '.bar[]' input-string.json
jq: error (at input-string.json:0): Cannot iterate over string ("baz")
jq has an if/then/elif/else/end construct, together with a type
function, which help normalize the data.
$ filter='.bar |
if type == "array" then .
elif type != "null" then [.]
else []
end'
$ jq "$filter" input-array.json
[
"baz",
"galook"
]
$ jq "$filter" input-string.json
[
"baz"
]
$ jq "$filter" input-none.json
[]
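If the normalization is needed in several places, it can be wrapped in a function defined at the top of the jq program. The following is a sketch under that assumption; as_array is a hypothetical name of mine, not a jq builtin. After normalizing, iterating with .[] is safe in all three cases.

```shell
# as_array is a made-up helper, defined inside the jq program itself
prog='def as_array:
  if type == "array" then .
  elif type == "null" then []
  else [.]
  end;
.bar | as_array | .[]'

none=$(printf '{"top":"foo"}' | jq -r "$prog")
single=$(printf '{"top":"foo","bar":"baz"}' | jq -r "$prog")
many=$(printf '{"top":"foo","bar":["baz","galook"]}' | jq -r "$prog")

printf 'none=[%s] single=[%s] many=[%s]\n' "$none" "$single" "$many"
```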
Caveat: the else branch is always needed. Well, unless you have a
recent version of jq (1.7 or later, as far as I know), where it can be
omitted.
Useful links
The following pages can help a lot: