{"expand":"renderedFields,names,schema,operations,editmeta,changelog,versionedRepresentations","id":"46038","self":"https://jira.geedge.net/rest/api/2/issue/46038","key":"OMPUB-1446","fields":{"issuetype":{"self":"https://jira.geedge.net/rest/api/2/issuetype/10004","id":"10004","description":"","iconUrl":"https://jira.geedge.net/secure/viewavatar?size=xsmall&avatarId=10303&avatarType=issuetype","name":"故障","subtask":false,"avatarId":10303},"components":[],"timespent":null,"timeoriginalestimate":null,"description":null,"project":{"self":"https://jira.geedge.net/rest/api/2/project/10206","id":"10206","key":"OMPUB","name":"Operation and Maintenance","projectTypeKey":"business","avatarUrls":{"48x48":"https://jira.geedge.net/secure/projectavatar?pid=10206&avatarId=10715","24x24":"https://jira.geedge.net/secure/projectavatar?size=small&pid=10206&avatarId=10715","16x16":"https://jira.geedge.net/secure/projectavatar?size=xsmall&pid=10206&avatarId=10715","32x32":"https://jira.geedge.net/secure/projectavatar?size=medium&pid=10206&avatarId=10715"},"projectCategory":{"self":"https://jira.geedge.net/rest/api/2/projectCategory/10002","id":"10002","description":"系统运维","name":"MaintenanceDev"}},"fixVersions":[],"aggregatetimespent":null,"resolution":null,"timetracking":{},"customfield_10401":null,"customfield_10104":null,"customfield_10402":null,"customfield_10105":"0|i05ym4:","customfield_10403":null,"customfield_10404":null,"attachment":[{"self":"https://jira.geedge.net/rest/api/2/attachment/62173","id":"62173","filename":"image-2024-09-03-10-45-22-623.png","author":{"self":"https://jira.geedge.net/rest/api/2/user?username=yangwei","name":"yangwei","key":"JIRAUSER10103","emailAddress":"yangwei@geedgenetworks.com","avatarUrls":{"48x48":"https://jira.geedge.net/secure/useravatar?ownerId=JIRAUSER10103&avatarId=10708","24x24":"https://jira.geedge.net/secure/useravatar?size=small&ownerId=JIRAUSER10103&avatarId=10708","16x16":"https://jira.geedge.net/secure/useravatar?size=xsmall&owner
Id=JIRAUSER10103&avatarId=10708","32x32":"https://jira.geedge.net/secure/useravatar?size=medium&ownerId=JIRAUSER10103&avatarId=10708"},"displayName":"杨威","active":true,"timeZone":"Asia/Shanghai"},"created":"2024-09-03T10:45:24.380+0800","size":1462257,"mimeType":"image/png","content":"https://jira.geedge.net/secure/attachment/62173/image-2024-09-03-10-45-22-623.png","thumbnail":"https://jira.geedge.net/secure/thumbnail/62173/_thumb_62173.png"},{"self":"https://jira.geedge.net/rest/api/2/attachment/62174","id":"62174","filename":"image-2024-09-03-11-03-22-597.png","author":{"self":"https://jira.geedge.net/rest/api/2/user?username=yangwei","name":"yangwei","key":"JIRAUSER10103","emailAddress":"yangwei@geedgenetworks.com","avatarUrls":{"48x48":"https://jira.geedge.net/secure/useravatar?ownerId=JIRAUSER10103&avatarId=10708","24x24":"https://jira.geedge.net/secure/useravatar?size=small&ownerId=JIRAUSER10103&avatarId=10708","16x16":"https://jira.geedge.net/secure/useravatar?size=xsmall&ownerId=JIRAUSER10103&avatarId=10708","32x32":"https://jira.geedge.net/secure/useravatar?size=medium&ownerId=JIRAUSER10103&avatarId=10708"},"displayName":"杨威","active":true,"timeZone":"Asia/Shanghai"},"created":"2024-09-03T11:03:22.717+0800","size":131145,"mimeType":"image/png","content":"https://jira.geedge.net/secure/attachment/62174/image-2024-09-03-11-03-22-597.png","thumbnail":"https://jira.geedge.net/secure/thumbnail/62174/_thumb_62174.png"},{"self":"https://jira.geedge.net/rest/api/2/attachment/62175","id":"62175","filename":"image-2024-09-03-11-07-07-174.png","author":{"self":"https://jira.geedge.net/rest/api/2/user?username=yangwei","name":"yangwei","key":"JIRAUSER10103","emailAddress":"yangwei@geedgenetworks.com","avatarUrls":{"48x48":"https://jira.geedge.net/secure/useravatar?ownerId=JIRAUSER10103&avatarId=10708","24x24":"https://jira.geedge.net/secure/useravatar?size=small&ownerId=JIRAUSER10103&avatarId=10708","16x16":"https://jira.geedge.net/secure/useravatar?size=xsmall&ownerI
d=JIRAUSER10103&avatarId=10708","32x32":"https://jira.geedge.net/secure/useravatar?size=medium&ownerId=JIRAUSER10103&avatarId=10708"},"displayName":"杨威","active":true,"timeZone":"Asia/Shanghai"},"created":"2024-09-03T11:07:07.303+0800","size":82677,"mimeType":"image/png","content":"https://jira.geedge.net/secure/attachment/62175/image-2024-09-03-11-07-07-174.png","thumbnail":"https://jira.geedge.net/secure/thumbnail/62175/_thumb_62175.png"}],"aggregatetimeestimate":null,"resolutiondate":null,"workratio":-1,"summary":"Multiple NPB devices at the BOL-IGW site raised TSG-OS container restart alerts","lastViewed":null,"watches":{"self":"https://jira.geedge.net/rest/api/2/issue/OMPUB-1446/watchers","watchCount":4,"isWatching":false},"creator":{"self":"https://jira.geedge.net/rest/api/2/user?username=songlongkun","name":"songlongkun","key":"JIRAUSER10914","emailAddress":"songlongkun@geedgenetworks.com","avatarUrls":{"48x48":"https://www.gravatar.com/avatar/643f4935f43167fd773a6b701f9fd05b?d=mm&s=48","24x24":"https://www.gravatar.com/avatar/643f4935f43167fd773a6b701f9fd05b?d=mm&s=24","16x16":"https://www.gravatar.com/avatar/643f4935f43167fd773a6b701f9fd05b?d=mm&s=16","32x32":"https://www.gravatar.com/avatar/643f4935f43167fd773a6b701f9fd05b?d=mm&s=32"},"displayName":"宋龙坤","active":true,"timeZone":"Asia/Shanghai"},"subtasks":[],"created":"2024-09-02T20:06:19.585+0800","reporter":{"self":"https://jira.geedge.net/rest/api/2/user?username=songlongkun","name":"songlongkun","key":"JIRAUSER10914","emailAddress":"songlongkun@geedgenetworks.com","avatarUrls":{"48x48":"https://www.gravatar.com/avatar/643f4935f43167fd773a6b701f9fd05b?d=mm&s=48","24x24":"https://www.gravatar.com/avatar/643f4935f43167fd773a6b701f9fd05b?d=mm&s=24","16x16":"https://www.gravatar.com/avatar/643f4935f43167fd773a6b701f9fd05b?d=mm&s=16","32x32":"https://www.gravatar.com/avatar/643f4935f43167fd773a6b701f9fd05b?d=mm&s=32"},"displayName":"宋龙坤","active":true,"timeZone":"Asia/Shanghai"},"customfield_10000":"{summaryBean=com.atlassian.jira.plugin.de
vstatus.rest.SummaryBean@a61b6e4[summary={pullrequest=com.atlassian.jira.plugin.devstatus.rest.SummaryItemBean@4ca920b7[overall=PullRequestOverallBean{stateCount=0, state='OPEN', details=PullRequestOverallDetails{openCount=0, mergedCount=0, declinedCount=0}},byInstanceType={}], build=com.atlassian.jira.plugin.devstatus.rest.SummaryItemBean@4670fd63[overall=com.atlassian.jira.plugin.devstatus.summary.beans.BuildOverallBean@4359c182[failedBuildCount=0,successfulBuildCount=0,unknownBuildCount=0,count=0,lastUpdated=<null>,lastUpdatedTimestamp=<null>],byInstanceType={}], review=com.atlassian.jira.plugin.devstatus.rest.SummaryItemBean@7101691c[overall=com.atlassian.jira.plugin.devstatus.summary.beans.ReviewsOverallBean@3d45f284[stateCount=0,state=<null>,dueDate=<null>,overDue=false,count=0,lastUpdated=<null>,lastUpdatedTimestamp=<null>],byInstanceType={}], deployment-environment=com.atlassian.jira.plugin.devstatus.rest.SummaryItemBean@7c1f8baa[overall=com.atlassian.jira.plugin.devstatus.summary.beans.DeploymentOverallBean@7147107a[topEnvironments=[],showProjects=false,successfulCount=0,count=0,lastUpdated=<null>,lastUpdatedTimestamp=<null>],byInstanceType={}], repository=com.atlassian.jira.plugin.devstatus.rest.SummaryItemBean@b74cec7[overall=com.atlassian.jira.plugin.devstatus.summary.beans.CommitOverallBean@19b7dcff[count=0,lastUpdated=<null>,lastUpdatedTimestamp=<null>],byInstanceType={}], branch=com.atlassian.jira.plugin.devstatus.rest.SummaryItemBean@19f556b8[overall=com.atlassian.jira.plugin.devstatus.summary.beans.BranchOverallBean@50f8413c[count=0,lastUpdated=<null>,lastUpdatedTimestamp=<null>],byInstanceType={}]},errors=[],configErrors=[]], 
devSummaryJson={\"cachedValue\":{\"errors\":[],\"configErrors\":[],\"summary\":{\"pullrequest\":{\"overall\":{\"count\":0,\"lastUpdated\":null,\"stateCount\":0,\"state\":\"OPEN\",\"details\":{\"openCount\":0,\"mergedCount\":0,\"declinedCount\":0,\"total\":0},\"open\":true},\"byInstanceType\":{}},\"build\":{\"overall\":{\"count\":0,\"lastUpdated\":null,\"failedBuildCount\":0,\"successfulBuildCount\":0,\"unknownBuildCount\":0},\"byInstanceType\":{}},\"review\":{\"overall\":{\"count\":0,\"lastUpdated\":null,\"stateCount\":0,\"state\":null,\"dueDate\":null,\"overDue\":false,\"completed\":false},\"byInstanceType\":{}},\"deployment-environment\":{\"overall\":{\"count\":0,\"lastUpdated\":null,\"topEnvironments\":[],\"showProjects\":false,\"successfulCount\":0},\"byInstanceType\":{}},\"repository\":{\"overall\":{\"count\":0,\"lastUpdated\":null},\"byInstanceType\":{}},\"branch\":{\"overall\":{\"count\":0,\"lastUpdated\":null},\"byInstanceType\":{}}}},\"isStale\":false}}","aggregateprogress":{"progress":0,"total":0},"customfield_10100":null,"priority":{"self":"https://jira.geedge.net/rest/api/2/priority/3","iconUrl":"https://jira.geedge.net/images/icons/priorities/medium.svg","name":"Medium","id":"3"},"customfield_10200":null,"customfield_10400":null,"labels":["E21现场"],"environment":null,"timeestimate":null,"aggregatetimeoriginalestimate":null,"versions":[],"duedate":null,"progress":{"progress":0,"total":0},"issuelinks":[],"comment":{"comments":[{"self":"https://jira.geedge.net/rest/api/2/issue/46038/comment/85415","id":"85415","author":{"self":"https://jira.geedge.net/rest/api/2/user?username=yangwei","name":"yangwei","key":"JIRAUSER10103","emailAddress":"yangwei@geedgenetworks.com","avatarUrls":{"48x48":"https://jira.geedge.net/secure/useravatar?ownerId=JIRAUSER10103&avatarId=10708","24x24":"https://jira.geedge.net/secure/useravatar?size=small&ownerId=JIRAUSER10103&avatarId=10708","16x16":"https://jira.geedge.net/secure/useravatar?size=xsmall&ownerId=JIRAUSER10103&avatarId
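=10708","32x32":"https://jira.geedge.net/secure/useravatar?size=medium&ownerId=JIRAUSER10103&avatarId=10708"},"displayName":"杨威","active":true,"timeZone":"Asia/Shanghai"},"body":"On-site situation\r\n * Bole-IGW has 10 NPBs in total across two groups. Except for Bole-IGW T9K02 NPB05, the traffic_engine on every other NPB restarted between 00:40 and 00:42 on September 1; the trigger was marsio reporting no mbuf.\r\n\r\nAnalysis\r\n * Checked the logs of the NPBs that restarted\r\n ** On all of them, the firewall process raised a CPU usage above 99% alert at 00:42, covering all packet-processing cores\r\n\r\n!image-2024-09-03-10-45-22-623.png|width=536,height=179!\r\n * \r\n ** NPB05, which did not restart, raised the same CPU usage above 99% alert at the same moment, and it cleared after 2s. Presumably this NPB had slightly more free mbufs than the other nodes at the time, so marsio no mbuf was not triggered; this NPB also did not raise a deadlock detection alert.\r\n ** Because CPU usage on all firewall packet-processing threads was close to 100%, the mbufs in the marsio buffer queues were not processed in time, and most NPBs triggered the marsio no mbuf alert within 1s\r\n * Checked the monitoring of the NPB that did not restart\r\n ** The restart window fell in a low-traffic period. At around 00:41 there was a clear rise in new UDP sessions, and monitor hits jumped from 100~300 per second to 9k+\r\n\r\n!image-2024-09-03-11-03-22-597.png|width=389,height=266!!image-2024-09-03-11-07-07-174.png|width=393,height=277!\r\n\r\nConclusion\r\n * In summary, based on the on-site logs and monitoring, the presumed cause of the restarts is that at 00:41 on September 1 all NPBs at the Bole-IGW site received a sudden surge of UDP traffic that hit monitor policies in large numbers; CPU usage on all processing threads exceeded 99%, triggering the marsio no mbuf alert and causing all pods to restart.\r\n\r\n \r\n\r\nIssues\r\n * The burst was short (about 1~2s according to the logs). Except for NPB05, the other NPBs triggered marsio no mbuf before reaching the 1s minimum detection period of overload protection; firewall needs to consider adjusting the overload protection detection period to a finer granularity.\r\n * Consider adding a limit on the hit rate of monitor policies\r\n\r\n 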
","updateAuthor":{"self":"https://jira.geedge.net/rest/api/2/user?username=yangwei","name":"yangwei","key":"JIRAUSER10103","emailAddress":"yangwei@geedgenetworks.com","avatarUrls":{"48x48":"https://jira.geedge.net/secure/useravatar?ownerId=JIRAUSER10103&avatarId=10708","24x24":"https://jira.geedge.net/secure/useravatar?size=small&ownerId=JIRAUSER10103&avatarId=10708","16x16":"https://jira.geedge.net/secure/useravatar?size=xsmall&ownerId=JIRAUSER10103&avatarId=10708","32x32":"https://jira.geedge.net/secure/useravatar?size=medium&ownerId=JIRAUSER10103&avatarId=10708"},"displayName":"杨威","active":true,"timeZone":"Asia/Shanghai"},"created":"2024-09-03T11:20:14.754+0800","updated":"2024-09-03T11:20:14.754+0800"}],"maxResults":1,"total":1,"startAt":0},"votes":{"self":"https://jira.geedge.net/rest/api/2/issue/OMPUB-1446/votes","votes":0,"hasVoted":false},"worklog":{"startAt":0,"maxResults":20,"total":0,"worklogs":[]},"assignee":{"self":"https://jira.geedge.net/rest/api/2/user?username=yangwei","name":"yangwei","key":"JIRAUSER10103","emailAddress":"yangwei@geedgenetworks.com","avatarUrls":{"48x48":"https://jira.geedge.net/secure/useravatar?ownerId=JIRAUSER10103&avatarId=10708","24x24":"https://jira.geedge.net/secure/useravatar?size=small&ownerId=JIRAUSER10103&avatarId=10708","16x16":"https://jira.geedge.net/secure/useravatar?size=xsmall&ownerId=JIRAUSER10103&avatarId=10708","32x32":"https://jira.geedge.net/secure/useravatar?size=medium&ownerId=JIRAUSER10103&avatarId=10708"},"displayName":"杨威","active":true,"timeZone":"Asia/Shanghai"},"updated":"2024-09-05T19:02:39.821+0800","status":{"self":"https://jira.geedge.net/rest/api/2/status/1","description":"问题已经准备好让经办人开始处理。","iconUrl":"https://jira.geedge.net/images/icons/statuses/open.png","name":"开放","id":"1","statusCategory":{"self":"https://jira.geedge.net/rest/api/2/statuscategory/2","id":2,"key":"new","colorName":"blue-gray","name":"待办"}}}}