Scientific Linux Forum.org




> Major infrastructure crash after upgrading to rpcbind-0.2.0-13.el6_9.x86_64
renaud
 Posted: May 24 2017, 11:52 AM
SLF Newbie
Group: Members
Posts: 6
Member No.: 3876
Joined: 24-May 17

Hi Everybody,

We mirror an SL6 repository nightly to a local server, and our machines (servers and workstations) upgrade from that mirror through yum.

Last night (2017-05-24), around 4:00 AM, the rpcbind package on our SL6 NIS servers (master and slaves) and on our NIS/NFS clients was upgraded from rpcbind-0.2.0-13.el6.x86_64 to rpcbind-0.2.0-13.el6_9.x86_64. The NFS servers are NetApps.

When I connected this morning, every rpcbind process was dead (on about 150 machines) on the same VLAN as the NIS master. There was no issue on the other VLANs (I have a slave on every VLAN), but there are fewer machines on those VLANs.

I tried restarting the rpcbind, autofs, and ypbind services, but rpcbind crashes again after some time (between 5 and 60 minutes).

I had to downgrade the package on all the machines on the NIS master's VLAN in order to stabilize the situation.
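
For anyone who needs to script the same rollback, here is a minimal Python sketch, not something from Red Hat or the SL team. It assumes the package string has the form printed by `rpm -q rpcbind` and that the previous build (rpcbind-0.2.0-13.el6) is still available in the repository yum points at. The helper names `parse_nvra` and `downgrade_command` are made up for illustration. It only builds the command; it does not run it.

```python
# Illustrative sketch only: helper names are hypothetical.
AFFECTED_RELEASES = ("13.el6_9",)      # build reported as crashing in this thread
KNOWN_GOOD = "rpcbind-0.2.0-13.el6"    # build the machines ran before the update


def parse_nvra(pkg):
    """Split 'name-version-release.arch' as printed by `rpm -q`."""
    nvr, arch = pkg.rsplit(".", 1)
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release, arch


def downgrade_command(installed):
    """Return the yum argv to roll back, or None if this is not the affected build."""
    name, version, release, _arch = parse_nvra(installed)
    if name == "rpcbind" and version == "0.2.0" and release in AFFECTED_RELEASES:
        return ["yum", "-y", "downgrade", KNOWN_GOOD]
    return None


if __name__ == "__main__":
    # On a real host this string would come from `rpm -q rpcbind`.
    installed = "rpcbind-0.2.0-13.el6_9.x86_64"
    cmd = downgrade_command(installed)
    print(" ".join(cmd) if cmd else "nothing to do")
```

Since a nightly yum run would simply pull the update back in, an `exclude=rpcbind*` line in /etc/yum.conf is one possible way to hold the package until a fixed build is published.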

I guess the package rpcbind-0.2.0-13.el6_9.x86_64 is a "bad patch".

Has anyone else had the same bad experience with this patch? If so, how did you manage the issue?

Thanks for your reply.

Best Regards, Renaud.
renaud
 Posted: May 24 2017, 12:36 PM

Bug referenced at RedHat side:

https://bugzilla.redhat.com/show_bug.cgi?id=1454876
burakkucat
 Posted: May 24 2017, 12:51 PM
SLF Administrator
****
Group: Admins
Posts: 205
Member No.: 14
Joined: 10-April 11

QUOTE (renaud @ May 24 2017, 12:36 PM)
Bug referenced at RedHat side:

https://bugzilla.redhat.com/show_bug.cgi?id=1454876


Yes, that's the one. I was just about to post the link when I saw that you had already discovered the upstream bug report.

--------------------
100% Linux and, previously, Unix. Co-founder of the ELRepo Project.
renaud
 Posted: May 24 2017, 01:54 PM

Hello,

Thanks for your answer. I wonder why no equivalent Bugzilla entry has been opened for RHEL 6.9, as mentioned in comment 6 of the RHEL 7.3 one: https://bugzilla.redhat.com/show_bug.cgi?id=1454876#c6

Regards, Renaud.
burakkucat
 Posted: May 24 2017, 10:40 PM

There is a Red Hat Knowledge Base entry [1], which discloses bug tracker entries for both RHEL 7.3 [2] and RHEL 6.9 [3] --

QUOTE

rpcbind crashes after update of CVE-2017-8779 when using ypbind

Solution In Progress - Updated 30 minutes ago - English

Environment

    Red Hat Enterprise Linux 6
        seen on rpcbind-0.2.0-13.el6_9.x86_64
    Red Hat Enterprise Linux 7
        seen on rpcbind-0.2.0-38.el7_3.x86_64
    ypbind

Issue

After installing rpcbind-0.2.0-13.el6_9.x86_64 (distributed as the security update RHSA-2017-1262-1), rpcbind stops after some rpcinfo -p executions, aborted by the memory check in the linked glibc.

Resolution

Red Hat Enterprise Linux 7

    A solution to this problem is tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1454876

Red Hat Enterprise Linux 6

    A solution to this problem is tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1455142

Root Cause

    Under investigation

    Product(s) Red Hat Enterprise Linux

    Category Troubleshoot

    Tags nfs3

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.


[1] https://access.redhat.com/solutions/3053461
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1454876
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1455142

renaud
 Posted: May 30 2017, 08:24 AM

Hi All,

Wonderful contribution from Red Hat this morning in https://access.redhat.com/solutions/3053461:
"As a workaround, downgrade the package to the earlier version"

I keep wondering why the bad security patch is still present in the Red Hat and SL repositories.

Best regards, Renaud.
renaud
 Posted: May 30 2017, 09:21 AM

Hello,

I noticed that I cannot reproduce the issue on machines with the fastbug glibc installed, so I added the following comment on https://access.redhat.com/solutions/3053461:

Hello,

I had to update glibc with the fastbug patch because I had a problem running Ansys/HFSS with the standard glibc provided with RHEL 6.9 (I did not have the problem with RHEL 6.8). I only updated glibc on my cluster compute nodes. I noticed that on machines with the fastbug glibc, I am not able to reproduce the issue.

My compute node:

[root@aar057 ~]# rpm -q glibc.x86_64 rpcbind.x86_64
glibc-2.12-1.209.el6_9.1.x86_64
rpcbind-0.2.0-13.el6_9.x86_64
[root@aar057 ~]# rpcbind -d -w
...
polling for read on fd < 5 6 7 8 9 10 11 >
polling for read on fd < 5 6 7 8 9 10 11 >
polling for read on fd < 5 6 7 8 9 10 11 >
polling for read on fd < 5 6 7 8 9 10 11 >

My workstation:

[root@kalahari ~]# rpm -q glibc.x86_64 rpcbind.x86_64
glibc-2.12-1.209.el6.x86_64
rpcbind-0.2.0-13.el6_9.x86_64
[root@kalahari ~]# rpcbind -d -w
...
7f665608f000-7f6656090000 rw-p 0000c000 fd:00 151559 /sbin/rpcbind
7f6656090000-7f6656091000 rw-p 00000000 00:00 0
7f66576be000-7f66576df000 rw-p 00000000 00:00 0 [heap]
7ffe0b35d000-7ffe0b386000 rw-p 00000000 00:00 0 [stack]
7ffe0b3c3000-7ffe0b3c4000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Aborted
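
The two `rpm -q` listings above differ only in the glibc release. As a rough sketch of how that comparison could be automated across more hosts (my own illustration, with hypothetical helper names `parse_nvra` and `diff_hosts`, and assuming each host's package list has already been collected as `rpm -q` output lines):

```python
# Illustrative sketch only: helper names are hypothetical.
def parse_nvra(pkg):
    """Split 'name-version-release.arch' as printed by `rpm -q`."""
    nvr, arch = pkg.rsplit(".", 1)
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release, arch


def diff_hosts(a_pkgs, b_pkgs):
    """Return {name: (version-release on A, version-release on B)} where they differ."""
    def index(pkgs):
        out = {}
        for pkg in pkgs:
            name, version, release, _arch = parse_nvra(pkg)
            out[name] = version + "-" + release
        return out

    a, b = index(a_pkgs), index(b_pkgs)
    return dict(
        (name, (a.get(name), b.get(name)))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    )


# Package lists copied from the two transcripts in this post.
compute_node = ["glibc-2.12-1.209.el6_9.1.x86_64", "rpcbind-0.2.0-13.el6_9.x86_64"]
workstation = ["glibc-2.12-1.209.el6.x86_64", "rpcbind-0.2.0-13.el6_9.x86_64"]

if __name__ == "__main__":
    print(diff_hosts(compute_node, workstation))
    # {'glibc': ('2.12-1.209.el6_9.1', '2.12-1.209.el6')}
```

This only reports which packages differ; it says nothing about whether the glibc difference is actually the cause.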

Another difference between the two machines is that nscd is active on the compute node but not on the workstation:

[root@aar057 ~]# service nscd status
nscd (pid 10037) is running...

[root@kalahari ~]# service nscd status
nscd: unrecognized service

I hope this contributes to solving the issue.

Best regards, Renaud.