如何从JSON中提取嵌套的值()

我有下面的代码, 输出中也需要状态和* policy工作流名称(即星期二), 任何人都可以帮忙。

import json import requests import pandas as pd from pandas.io.json import json_normalize import ast pd.set_option('display.max_columns', None) pd.set_option('display.max_rows', None) pd.set_option('display.width', 100)a = [{'attributes': [{'key': '*policy action jobid', 'values': ['289903']}, {'key': '*policy action name', 'values': ['backup']}, {'key': '*policy name', 'values': ['Daily_Backups']}, {'key': '*policy workflow name', 'values': ['tuesday']}, {'key': 'clone retention policy', 'values': ['504:5:34']}, {'key': 'group', 'values': ['tuesday']}, {'key': 'saveset features', 'values': ['CLIENT_SAVETIME']}], 'browseTime': '2020-05-19T23:57:41+08:00', 'clientHostname': 'xyz.com', 'clientId': '7d391c52-00000004-5cda459d-5c1', 'creationTime': '2020-04-28T21:29:25+08:00', 'fileCount': 0, 'id': '1eb1', 'instances': [], 'level': 'Full', 'links': [{'href': 'https://iservera/backups/1ec1', 'rel': 'item'}], 'name': '/abc', 'retentionTime': '2020-05-19T23:57:41+08:00', 'saveTime': '2020-04-28T21:27:07+08:00', 'shortId': '2177', 'size': {'unit': 'Byte', 'value': 0}, 'type': 'File'}, {'attributes': [{'key': '*policy action jobid', 'values': ['2803']}, {'key': '*policy action name', 'values': ['backup: 1589']}, {'key': '*policy name', 'values': ['Daily_Backups: 159']}, {'key': '*policy workflow name', 'values': ['tuesday: 1588079529']}, {'key': '*ss clone retention', 'values': ['1588079529:1588079590:1824409']}, {'key': 'group', 'values': ['tuesday']}, {'key': 'saveset features', 'values': ['CLIENT_SAVETIME']}], 'browseTime': '2020-05-19T23:57:42+08:00', 'clientHostname': 'abc.com', 'clientId': 'ec3dc1', 'completionTime': '2020-04-28T21:29:47+08:00', 'creationTime': '2020-04-28T21:13:10+08:00', 'fileCount': 0, 'id': 'cc1', 'instances': [{'clone': False, 'id': '1588079529', 'status': 'Aborted', 'volumeIds': ['245614341']}], 'level': 'Full', 'links': [{'href': 'https://abc/backups/c771', 'rel': 'item'}], 'name': '/xyz', 'retentionTime': '2020-05-19T23:57:42+08:00', 'saveTime': '2020-04-28T21:10:53+08:00', 'shortId': '2141727718', 'size': {'unit': 'Byte', 'value': 36264099844}, 'type': 'NDMP'}] df = json_normalize(a) a = df[['clientHostname', 'completionTime', 'size.value', 'type', 'fileCount']] print(a)```Current output is- ```clientHostnamecompletionTimesize.valuetypefileCount 0xyz.comNaN0File0 1abc.com2020-04-28T21:29:47+08:0036264099844NDMP0```Output needed is- ```clientHostnamecompletionTimesize.valuetypefileCount statusPolicy 0xyz.comNaN0File0-tuesday 1abc.com2020-04-28T21:29:47+08:0036264099844NDMP0Aborted tuesday```

#1你可以获取策略工作流名称为df [‘ attributes’ ] [0] [3] [‘ key’ ], df [‘ attributes’ ] [0] [3] [‘ values’ ]和状态为df [‘ instances ‘ ] [i] [0] [‘ status’ ], 其中i是记录号。
#2我将使用库jmespath遍历json数据:
要访问密钥, 请使用。要访问列表, 请使用[]表示法。
import jmespath expression = jmespath.compile(""" []. {clientHostname:clientHostname, completionTime:completionTime, "size.value":size.value, type:type, fileCount:fileCount, status:instances[].status, Policy:attributes[?key==`*policy workflow name`].values[]} """) res = expression.search(a) res[{'clientHostname': 'xyz.com', 'completionTime': None, 'size.value': 0, 'type': 'File', 'fileCount': 0, 'status': [], 'Policy': ['tuesday']}, {'clientHostname': 'abc.com', 'completionTime': '2020-04-28T21:29:47+08:00', 'size.value': 36264099844, 'type': 'NDMP', 'fileCount': 0, 'status': ['Aborted'], 'Policy': ['tuesday: 1588079529']}]

进行一些清理以适合你的用例:
#get the day of the week for policy keys res = [{key:value[0].split(':')[0] if key=="Policy" else value for key, value in ent.items()} for ent in res]#if list is empty, replace it with "-" res = [{key:"-" if (isinstance(value, list) and not value) else value for key, value in ent.items()} for ent in res] pd.DataFrame(res)clientHostnamecompletionTimesize.valuetypefileCountstatusPolicy 0xyz.com None0File0-tuesday 1abc.com 2020-04-28T21:29:47+08:0036264099844 NDMP0[Aborted]tuesday

【如何从JSON中提取嵌套的值()】你可能需要查看数据, 看看是否需要进行其他转换

    推荐阅读