rwiggins 6 hours ago

Oh, fantastic. jq has become an integral part of work for me.

I'll use this opportunity to plug the one-liner I use all the time, which summarizes the "structure" of a doc in a jq-able way: https://github.com/stedolan/jq/issues/243#issuecomment-48470... (I didn't write it, I'm just a happy user)

For example:

    $ curl -s 'https://ip-ranges.amazonaws.com/ip-ranges.json' | jq -r '[path(..)|map(if type=="number" then "[]" else tostring end)|join(".")|split(".[]")|join("[]")]|unique|map("."+.)|.[]'
    .
    .createDate
    .ipv6_prefixes
    .ipv6_prefixes[]
    .ipv6_prefixes[].ipv6_prefix
    .ipv6_prefixes[].network_border_group
    .ipv6_prefixes[].region
    .ipv6_prefixes[].service
    .prefixes
    .prefixes[]
    .prefixes[].ip_prefix
    .prefixes[].network_border_group
    .prefixes[].region
    .prefixes[].service
    .syncToken

(Except I have it aliased to "jq-structure" locally, of course. Also, if there's a new fancy way to do this, I'm all ears; I've been using this alias for like... almost a decade now :/)
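
For comparison, the same kind of summary can be sketched in Python. This is only a rough equivalent, not the code from the linked issue comment; the name `structure_paths` is made up for illustration, and like the one-liner it gives ambiguous output for keys that contain `.` or `[]`.

```python
import json


def structure_paths(doc):
    """Return sorted, de-duplicated jq-style paths with array indices collapsed to []."""
    paths = set()

    def visit(node, prefix):
        paths.add(prefix or ".")  # the root is reported as "."
        if isinstance(node, dict):
            for key, value in node.items():
                visit(value, f"{prefix}.{key}")
        elif isinstance(node, list):
            for item in node:
                # every element of an array shares one "[]" path segment
                visit(item, f"{prefix or '.'}[]")

    visit(doc, "")
    return sorted(paths)


# Small hypothetical document, loosely shaped like the ip-ranges example.
sample = json.loads(
    '{"createDate": "x", "prefixes": [{"region": "a"}, {"region": "b", "service": "s"}]}'
)
print("\n".join(structure_paths(sample)))
```

On that sample it prints `.`, `.createDate`, `.prefixes`, `.prefixes[]`, `.prefixes[].region` and `.prefixes[].service`, one per line.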

In the spirit of trying out jqfmt, let's see how it formats that one-liner...

    ~  echo '[path(..)|map(if type=="number" then "[]" else tostring end)|join(".")|split(".[]")|join("[]")]|unique|map("."+.)|.[]' | ~/go/bin/jqfmt -ob -ar -op pipe
    [
        path(..) | 
        map(if type == "number" then "[]" else tostring end) | 
        join(".") | 
        split(".[]") | 
        join("[]")
    ] | 
        unique | 
        map("." + .) | 
        .[]%
    ~  

Not bad! Shame that jqfmt doesn't output a newline at the end, though. The errant `%` is zsh's partial line marker. Also, `-ob -ar -op pipe` seems like a pretty good set of defaults to me - I would prefer that over it (seemingly?) not doing anything with no flags. (At least for this sample snippet.)
  • naniwaduni 3 hours ago

    For small problem sizes, you can get a nontrivial improvement by moving the unique up ahead of all the string manipulation:

        jq -r '[path(..)|map(if type=="number" then "[]" end)]|unique[]|join(".")/".[]"|"."+join("[]")'
    
    For larger problem sizes, you might enjoy this approach to avoid generating the array of all paths as an intermediate, instead producing a deduped shadow structure as you go along:

        jq -rn --stream 'reduce (inputs|select(length==2)[0]|map(if type=="number" then "[]" end)) as $_ (.; setpath($_; 1))|path(..)|join(".")/".[]"|"."+join("[]")'
    
    (Note that in either case, you still run yourself into a bit of trouble with fields named "[]", as well as field names with "." in them. I assume this is not a serious issue, since you're only ever looking at this interactively.)
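
The "deduped shadow structure" idea can also be sketched in Python, under some assumptions: `merge_shape`, `render_paths` and the `ARRAY` sentinel are illustrative names, and unlike `--stream` this still parses the whole document with `json` first. Only the set of paths is deduplicated as it goes. Using a sentinel object instead of the string `"[]"` sidesteps the collision with a field literally named `[]`, though keys containing `.` still render ambiguously.

```python
import json

ARRAY = object()  # sentinel key meaning "any array element"


def merge_shape(shadow, node):
    """Fold one JSON value into the shadow structure, in place, and return it."""
    if isinstance(node, dict):
        for key, value in node.items():
            merge_shape(shadow.setdefault(key, {}), value)
    elif isinstance(node, list):
        for item in node:
            # all elements are merged into a single shared child
            merge_shape(shadow.setdefault(ARRAY, {}), item)
    return shadow


def render_paths(shadow, prefix=""):
    """Yield jq-style paths from the shadow structure, depth first."""
    yield prefix or "."
    if ARRAY in shadow:
        yield from render_paths(shadow[ARRAY], f"{prefix or '.'}[]")
    for key in sorted(k for k in shadow if k is not ARRAY):
        yield from render_paths(shadow[key], f"{prefix}.{key}")


sample = json.loads('{"a": [{"b": 1}, {"b": 2, "c": null}], "d": []}')
print("\n".join(render_paths(merge_shape({}, sample))))
```

The shadow structure only ever holds one node per distinct path, so its size depends on the shape of the document rather than on how many array elements it has.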
  • petercooper 6 hours ago

    Not anywhere near as sophisticated as yours but I have something vaguely similar for simplifying JSON documents (while maintaining what the data also looks like) for feeding to LLMs to help them code against:

        jq 'walk(if type == "array" then (if length > 0 then [.[0]] else . end) else . end)'
    
    So that 70,000+ line Amazon example of yours would boil down to:

        {
          "syncToken": "1753114994",
          "createDate": "2025-07-21-16-23-14",
          "prefixes": [
            {
              "ip_prefix": "3.4.12.4/32",
              "region": "eu-west-1",
              "service": "AMAZON",
              "network_border_group": "eu-west-1"
            }
          ],
          "ipv6_prefixes": [
            {
              "ipv6_prefix": "2600:1f69:7400::/40",
              "region": "mx-central-1",
              "service": "AMAZON",
              "network_border_group": "mx-central-1"
            }
          ]
        }
    
    .. which is easier/cheaper to feed to an LLM for getting it to write code to process, etc. than the multi-megabyte original.
    • rwiggins 5 hours ago

      Oh wow, that's fantastic. I love that it includes real values while still summarizing the doc's structure. I'm going to steal that. I'll probably keep jq-structure around because it's so easy to copy/paste paths I'm looking for, but yours is definitely better for understanding what the JSON doc actually contains.

    • naniwaduni 2 hours ago

      Got a bit nerd-sniped here, but first of all we can reduce if A then B else . end === if A then B end since jq 1.7:

          jq 'walk(if type == "array" then (if length > 0 then [.[0]] end) end)'
      
      Now we could contract those conditionals:

          jq 'walk(if type == "array" and length > 0 then [.[0]] end)'
      
      but it turns out we can even more usefully express if length > 0 then [.[0]] end === [limit(1; .[])] == .[:1]:

          jq 'walk(if type == "array" then .[:1] end)'
      
      From here, we can golf it a little further (this is kind of a generic type-matching pattern):

          jq 'walk(arrays[:1] // .)'
      
      although this does incur a bit more overhead than checking type directly.

      Speaking of overhead, though, it turns out that the implementation of walk/1 (https://github.com/jqlang/jq/blob/master/src/builtin.jq#L212) will actually run the filter on every element of an array, even though we're about to throw most of them out, which we can eliminate by writing the recursion explicitly:

          jq 'def w: if type=="array" then [limit(1; .[]|w)] elif type=="object" then .[] |= w end; w'
      
      which gets the operation down from ~200 ms on my machine (not long enough to really get distracted, but enough to feel the wait) to a perceptually instant ~40 ms (which is mostly just the cost of reading the input). Now we can golf it down a little more:

          jq 'def w: if type=="array" then [limit(1; .[]|w)] else objects[] |= w end; w'
          jq 'def w: (arrays[:1]|map(w)) // (objects[] |= w); w'
      
      (the precedence here actually allows us to eliminate the parens here...)

          jq 'def w: arrays |= .[:1]|iterables[] |= w; w'
      
      And, inaccessibility of the syntax aside, I think this does an incredible job of expressing the essence of what we're trying to do: we trim any array down to its first element, and then recursively apply the same transformation throughout the structure. jq is a very expressive language, it just looks like line noise...
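
For readers who find the jq hard to parse, here is a minimal Python sketch of the same transformation, written with the explicit recursion described above so that only the first element of each array is ever visited. The function name `first_elements` is made up for illustration.

```python
import json


def first_elements(node):
    """Trim every array to its first element, recursing only into what is kept."""
    if isinstance(node, list):
        # elements after the first are dropped without being processed
        return [first_elements(node[0])] if node else []
    if isinstance(node, dict):
        return {key: first_elements(value) for key, value in node.items()}
    return node  # scalars pass through unchanged


sample = json.loads('{"prefixes": [{"ip": "a", "tags": [1, 2]}, {"ip": "b"}], "empty": []}')
print(json.dumps(first_elements(sample), indent=2))
```

It is longer than the golfed jq, but the two branches (arrays, then objects) map directly onto `arrays |= .[:1]` and `iterables[] |= w`.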
      • Bluestein 2 hours ago

        Hat off.-

        PS. Also, if I may, thanks for the walkthrough - I'd be clapping with just the short form at the end, but the reasoning is appreciated.-

  • jzelinskie 6 hours ago

    This is an incredibly useful one-liner. Thank you for sharing!

    I'm a big fan of jq, having written my own jq wrapper that supports multiple formats (github.com/jzelinskie/faq), but these days I find myself more quickly reaching for Python when I get any amount of complexity. Being able to use uv scripts in Python has considerably lowered the bar for me to use it for scripting.

    Where are you drawing the line?

    • rwiggins 5 hours ago

      Hmm. I stick to jq for basically any JSON -> JSON transformation or summarization (field extraction, renaming, etc.). Perhaps I should switch to scripts more. uv is... such a game changer for Python, I don't think I've internalized it yet!

      But as an example of about where I'd stop using jq/shell scripting and switch to an actual program... we have a service that has task queues. The number of queues for an endpoint is variable, but enumerable via `GET /queues` (I'm simplifying here of course), which returns e.g. `[0, 1, 2]`. There was a bug where certain tasks would get stuck in a non-terminal state, blocking one of those queues. So, I wanted a simple little snippet to find, for each queue, (1) which task is currently executing and (2) how many tasks are enqueued. It ended up vaguely looking like:

          for q in $(curl -s "$endpoint/queues" | jq -r '.[]'); do
              curl -s "$endpoint/queues/$q" \
              | jq --argjson q "$q" '
                  {
                      "queue": $q,
                      "executing": .currently_executing_tasks,
                      "num_enqueued": (.enqueued_tasks | length)
                  }'
          done | jq -s
      
      
      which ends up producing output like (assuming queue 0 was blocked)

          [
              {
                  "queue": 0,
                  "executing": [],
                  "num_enqueued": 100
              },
              ...
          ]
      
      I think this is roughly where I'd start to consider "hmm, maybe a proper script would do this better". I bet the equivalent Python is much easier to read and probably not much longer.

      Although, I think this example demonstrates how I typically use jq, which is like a little multitool. I don't usually write really complicated jq.
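
As a rough idea of what the Python version of that loop might look like: the `/queues` routes and the two field names come from the snippet above, while `summarize_queues`, `fetch_json` and the injectable `fetch` parameter are hypothetical names chosen so the sketch can run without a real service.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def summarize_queues(endpoint, fetch=fetch_json):
    """For each queue, report what is executing and how many tasks are waiting."""
    summaries = []
    for q in fetch(f"{endpoint}/queues"):
        detail = fetch(f"{endpoint}/queues/{q}")
        summaries.append({
            "queue": q,
            "executing": detail["currently_executing_tasks"],
            "num_enqueued": len(detail["enqueued_tasks"]),
        })
    return summaries


# Canned responses stand in for the real service in this demo.
fake_responses = {
    "http://svc/queues": [0],
    "http://svc/queues/0": {"currently_executing_tasks": [], "enqueued_tasks": ["t1", "t2"]},
}
print(json.dumps(summarize_queues("http://svc", fetch=fake_responses.__getitem__), indent=4))
```

Passing `fetch` as a parameter is just a convenience for trying the logic offline; with the default it would make real HTTP requests.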

    • dotancohen 2 hours ago

      I could Google it, but tell me a bit more about uv scripts. Isn't uv a package manager like pip?

      • easton an hour ago

        uv has a feature where you can put a magic comment at the top of a script and it will pull all the dependencies into its central store when you do “uv run …”. And then it makes a special venv too I think? That part’s cloudier.

        https://docs.astral.sh/uv/guides/scripts/

        Makes it a snap to have a one file python script without having to explicitly pip install requests or whatever into a venv.

        • wonger_ 43 minutes ago

          Example usage for those who haven't seen it yet:

            #!/usr/bin/env -S uv run --script
            #
            # /// script
            # requires-python = ">=3.12"
            # dependencies = ["httpx"]
            # ///
            
            import httpx
            
            print(httpx.get("https://example.com"))
    • Bluestein 6 hours ago

      May I also add this ain't a mere one liner. It's a masterclass!

  • jdc0589 5 hours ago

    this is a super useful oneliner, immediately saved to my bash profile as `jqstructure`

Hendrikto 6 hours ago

> Side note: Ever tried Googling for "jq formatter"? Reading search results is a nightmare since jq itself is, among other things, a formatter.

That’s what I thought too, when I read the title. To clarify: This tool formats jq commands, not JSON itself.

  • vanschelven 5 hours ago

    Which makes sense because jq, with no options, acts as a formatter by default. (it's about 50% of my jq usage).

  • layer8 5 hours ago

    While it doesn’t help much for search in this case, the more specific term is “pretty-printer”.

s17n 4 hours ago

If you need to format your one-liner, maybe it shouldn't be a one liner?

Anyway, whether or not this tool is advisable, it's definitely cool, nice work!

  • noperator 3 hours ago

    My prototype one-liners usually turn into Go programs :)

  • Bluestein 4 hours ago

    > If you need to format your one-liner, maybe it shouldn't be a one liner?

    Entirely correct, this point.-

    PS. May I also appreciate your comment, as far as form? You made both, valid, points.-

kiitos 5 hours ago

Instead of making users enable every formatting rule explicitly e.g.

    jqfmt -ob -ar -op pipe

It would be better if the tool enabled a common set of rules by default, so that `echo ... | jqfmt` actually did something useful :)

ForOldHack 11 minutes ago

Stop naming products after falling silverware.

jq is sed for JSON data. gofmt is a Go source formatter. jqfmt is like gofmt, a Go source formatter, but for JSON. So jqfmt is really a JSON beautifier...

Anyone with an ASR-33 for sale? rq?

xmonkee 5 hours ago

God I really abhor jq and it seems it's becoming a standard. I dislike it cause I'm too dumb to correctly dredge up its incantations, and once a year I have to go reading their arcane docs. I suppose it's another fertile ground for LLM use.

  • mdaniel 4 hours ago

    The bad news is that much like how "I'm just going to DSL this ..." inevitably morphs into a full-blown programming language[1], so too is the ubiquitous "gah, your language is too complex, I'm going to just use this other tool that implements my favorite 10% of the cases"

    which is a long way of saying: or else what? There's 100% no way that I'm going to ever, ever use <<python3 -c "import json, sys; print(json.load(sys.stdin)[...ohgawd...]">> and if you are, then more power to ya and jq apparently doesn't solve a problem you have

    1: https://www.laws-of-software.com/laws/zawinski/

  • benreesman an hour ago

    It's a pretty good on/off-ramp into better tools. Going from arbitrary slop to something that's a reasonable input to `nixlang` or Dhall is pure win IMHO.

    I get a lot of use out of `jq` even though I prefer sounder systems than JSON.

  • pxc 3 hours ago

    What would "non-arcane" jq docs look like? I'm kind of in the same boat, being an infrequent jq user, but I've generally found the docs pretty easy to navigate.

  • quotemstr 35 minutes ago

    Hey. Don't hate on jq too much. It's a backdoor way to get functional programming past people's mental perceived complexity forcefields.

  • ashwinsundar 3 hours ago

    A standard for what? It just makes JSON look nicer and more query-able. You don't have to use it.

    • xmonkee 3 hours ago

      A standard as in there is a cottage industry of tools and websites built around it now, like this one.

      • lxgr an hour ago

        Given the choice between a hypothetical standard that nobody wrote (or implemented) and a tool that organically grew complex enough to benefit from a standard, I'd rather have the latter.

        Users (i.e. not implementors) usually also don't read the standard – they read the docs (ideally containing lots of examples on top of a dry enumeration of options), or today indeed ask an LLM.

mikeocool 4 hours ago

Been using fx (fx.wtf) as an alternative to jq recently.

Gives you a nice JavaScript interface to do similar types of processing to what I would do with jq.

guerrilla 5 hours ago

I thought this title was rot13 at first. :D

  • Bluestein 5 hours ago

    Gubhtug V jnf gur bayl bar :)

    PS. Honestly, it's pretty close.-

quotemstr 38 minutes ago

jq is convenient, but I don't see the draw in building data processing pipelines on it. It's like writing complex software in shell.

Recently, I found myself wanting to do a join by filename on two sets of about 300,000 files. Tried bashing my head against jq with INDEX and various tricks and couldn't get the runtime below minutes.

Then I just gave up, fired up Python, loaded the dataset into Pandas, and did a join. Completed too fast to notice.
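
The pandas code isn't shown, so the following is not it. It is a standard-library sketch of the same kind of join (index one side by key, then probe it with the other), assuming each record is a dict with a `filename` field; `join_by_filename` and the sample fields are hypothetical.

```python
def join_by_filename(left, right, key="filename"):
    """Inner-join two lists of dicts on `key` using a dict as a hash index."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined = []
    for row in left:
        for match in index.get(row[key], []):
            # on conflicting field names, values from `left` win
            joined.append({**match, **row})
    return joined


left = [{"filename": "a.txt", "size": 1}, {"filename": "b.txt", "size": 2}]
right = [{"filename": "a.txt", "sha": "x"}, {"filename": "c.txt", "sha": "y"}]
print(join_by_filename(left, right))
```

Each side is scanned once, so the work grows roughly linearly with the number of records rather than with the product of the two sets.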