Scientific Linux Forum.org




> Another 304xx driver Question
kenmorgan
 Posted: Feb 18 2014, 04:35 PM
SLF Member
Group: Members
Posts: 60
Member No.: 1782
Joined: 8-August 12

OS: CentOS 6.3
Graphics Card: GeForce 6200
Number of Monitors: 2

A while back, several of you on this forum were very helpful to me with Nvidia driver problems. The somewhat lengthy thread is here:

http://scientificlinuxforum.org/index.php?showtopic=1825&hl=

However, the part of this thread relevant to my new problem begins with my "Jan 27 2013, 02:44 AM" post. In reviewing that material again, I noticed that toracat joined in with a very helpful post. If he sees me over here again, though, I might be in trouble back on the CentOS forum! :)

The conclusion of that thread was that for my graphics card, I needed to use the legacy 304xx driver found in the elrepo repository. That is still what I am using.

Yesterday, I posted a problem on the Mozillazine Firefox forum that I was having with Firefox. The sad story can be found here:

http://forums.mozillazine.org/viewtopic.php?f=38&t=2801957&sid=f4458107903a31c87c871fb450e0957a

I had been running the CentOS ESR repo version 24, but I wanted to try the native "latest and greatest" version from the Mozilla Web site. So I installed 27.0.1, but in a very short time I ran into a problem: it continually crashed when going to one particular Web site. If I go here--

http://www.ifcj.org/site/PageNavigator/eng/rabbi/jtts/isaiah_3/isaiah3_l3

and start pulling up "lessons" from the left-hand column, Firefox usually crashes around lesson 4. But it also crashes on the previous page,

http://www.ifcj.org/site/PageNavigator/eng/jtts

It does, in fact, crash EVERY TIME I go to one of those two pages. The Mozilla crash report is here:

https://crash-stats.mozilla.com/report/index/dd9de1a9-e8c0-4e68-8263-5a1552140218

Apparently, the module libnvidia-glcore.so is the problem.

I'm wondering why crashing doesn't occur when I go to other Web sites. However, the big question is whether there is a solution to this problem. I need to be able to go to that Web site.

Your help would be greatly appreciated.

Ken




--------------------
Ken
CentOS 6
burakkucat
 Posted: Feb 18 2014, 09:16 PM
SLF Administrator
Group: Admins
Posts: 205
Member No.: 14
Joined: 10-April 11

It might be worthwhile confirming that only one copy of the shared object file (libnvidia-glcore.so) is installed and from which package it originates.

The output from the following command line may be useful --

rpm -qf $(find / -xdev -name libnvidia-glcore.\* 2>/dev/null)
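
A minimal sketch of the same check, written so that each file is printed next to the package that owns it. The function name list_glcore_owners and its optional directory argument are illustrative assumptions, not part of the original advice. Looping over the find results also avoids calling rpm -qf with no arguments when nothing is found.

```shell
#!/bin/sh
# Print every libnvidia-glcore.* file under a directory tree (default: /)
# together with the RPM package that owns it, one pair per line.
# list_glcore_owners is a hypothetical helper name used for illustration.
list_glcore_owners() {
    root=${1:-/}
    find "$root" -xdev -name 'libnvidia-glcore.*' 2>/dev/null |
    while IFS= read -r f; do
        # rpm -qf prints the owning package, or a message if no package owns the file
        printf '%s\t%s\n' "$f" "$(rpm -qf "$f" 2>&1)"
    done
}
```

Run as root with no argument (list_glcore_owners) to search the whole root filesystem. More than one line of output would suggest that more than one copy of the library is installed.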

--------------------
100% Linux and, previously, Unix. Co-founder of the ELRepo Project.
kenmorgan
 Posted: Feb 18 2014, 09:42 PM

Thanks for the response. Here's the output:

CODE

[root@localhost1 kmorgan]# rpm -qf $(find / -xdev -name libnvidia-glcore.\* 2>/dev/null)
nvidia-x11-drv-304xx-304.117-1.el6.elrepo.i686
[root@localhost1 kmorgan]#



--------------------
burakkucat
 Posted: Feb 18 2014, 09:57 PM

So we can see that there is only one version of that shared object file present and it does originate from the nvidia-x11-drv-304xx package.

That libnvidia-glcore.so file has been built by nVidia. It has only been packaged by the ELRepo Project.

I've given our toracat a bump, so let's now wait to see if there is any further feline advice.

--------------------
toracat
 Posted: Feb 18 2014, 10:40 PM
SLF Geek
Group: Members
Posts: 303
Member No.: 11
Joined: 10-April 11

@kenmorgan

I saw in one of the links you referenced that you mentioned, "I need Nvidia (I think) because the driver needs to be able to run two monitors". You don't need to use Nvidia for that. More details can be found in this CentOS FAQ.

--------------------
ELRepo: repository specializing in hardware support for EL
kenmorgan
 Posted: Feb 19 2014, 12:31 AM

Thanks, toracat. I looked at the link you suggested. Just so I can ask some questions, here's the output of the xrandr command:

"rest and relaxation for the x window system"? :)

CODE

[kmorgan@localhost1 ~]$ xrandr -q
Screen 0: minimum 8 x 8, current 3520 x 1200, maximum 4096 x 4096
VGA-0 connected 1920x1200+1600+0 (normal left inverted right x axis y axis) 518mm x 324mm
  1920x1200      60.0*+   59.9  
  1680x1050      60.0  
  1600x1200      60.0  
  1600x1000      60.0  
  1440x900       59.9  
  1280x1024      75.0  
  1280x960       60.0  
  1280x720       60.0  
  1024x768       75.0     60.0  
  800x600        75.0     60.3  
  640x480        75.0     59.9  
DVI-I-0 connected 1600x900+0+53 (normal left inverted right x axis y axis) 442mm x 249mm
  1600x900       60.0*+
  1440x900       59.9  
  1280x1024      60.0  
  1024x768       60.0  
  800x600        60.3  
  640x480        59.9  
TV-0 disconnected (normal left inverted right x axis y axis)
DVI-I-1 disconnected (normal left inverted right x axis y axis)
[kmorgan@localhost1 ~]$

Now just so I don't mess something up and suddenly lose both monitors, here are some questions:

(1) From the help you and tux99 gave me at the end of the old thread on this forum (cited as the first URL in my first post on this thread), I thought we were saying that because of my old GeForce 6200 graphics card, I needed to use either Nouveau or Nvidia drivers because the chips on the card were Nvidia chips. I must have misunderstood. So that's really not the case then?

(2) What happens to the Nvidia legacy 304xx driver when I issue the xrandr command? Should I use yum to uninstall it first? Will I still have at least one monitor working when I uninstall 304xx?

(3) After issuing the xrandr command, what driver is really "driving" the graphics card?

(4) And just for my education, do you have any idea why the Nvidia driver made Firefox crash at that one Web site?

Thanks so much. I hope I'm not asking for a huge amount of time here.

Ken

--------------------
burakkucat
 Posted: Feb 19 2014, 10:30 PM

Ken -- I have been thinking about your problem.

If your usage of the nVidia driver (and all its baggage) is solely the result of the need to configure dual monitors, then I would recommend that you remove the two packages and make use of the native xorg-x11 drivers in conjunction with the xrandr command.

Recipe --

(1) xrandr -q and note how the dual monitors are currently configured.
(2) yum remove kmod-nvidia-304xx nvidia-x11-drv-304xx
(3) Re-boot.
(4) Test various incantations of xrandr, using the CentOS FAQ as a guide.
(5) Once the correct xrandr command line has been identified, insert it into the /etc/gdm/Init/Default file immediately before the closing line (exit 0).
(6) Re-boot. The system should now "automagically" have both monitors operational.
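
As a rough sketch of steps (1) and (4), the layout reported by xrandr -q can be turned into a single xrandr command line that recreates it. The helper name layout_to_xrandr is hypothetical. It also assumes the output names stay the same after the driver change, which may not hold: a different driver can report different names (for example VGA-1 instead of VGA-0), so the generated line may need editing before use.

```shell
#!/bin/sh
# Read saved `xrandr -q` output on stdin and print an xrandr command line
# that recreates the same mode and position for every active output.
# layout_to_xrandr is a hypothetical helper name used for illustration.
layout_to_xrandr() {
    awk '
        $2 == "connected" {
            # The geometry is field 3, or field 4 when the output is marked primary
            geom = ($3 == "primary") ? $4 : $3
            # geom looks like 1920x1200+1600+0: mode, x offset, y offset
            if (split(geom, g, "[+]") == 3)
                args = args " --output " $1 " --mode " g[1] " --pos " g[2] "x" g[3]
        }
        END { if (args != "") print "xrandr" args }
    '
}
```

For example, save the current layout with xrandr -q > ~/xrandr-before.txt before removing the driver, then run layout_to_xrandr < ~/xrandr-before.txt afterwards. With the layout posted earlier in this thread, the sketch prints:

xrandr --output VGA-0 --mode 1920x1200 --pos 1600x0 --output DVI-I-0 --mode 1600x900 --pos 0x53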

--------------------
kenmorgan
 Posted: Feb 20 2014, 01:05 AM

Something unexpected occurred. I got as far as step 3. The system will not reboot. I was able to pick up a few things because I have "quiet" removed in grub.conf. The following description is duplicated on both monitors.

First, there's a long period of a blank screen with a small underscore character in the upper left-hand corner. Finally, the boot process starts. Literally hundreds of Nouveau error lines flash on the screen at an incredible speed. Then the screen becomes a solid color. After a little wait, normal-looking boot lines begin. But it reaches a certain point where Nouveau error lines come up again, but this time at the normal rate where I can read a little. All the lines start with "nouveau." Then some have "DMA_Pusher." These also have "invalid_cmd" or "mem_fault." Some lines have "failed to idle channel," other lines have "cache error." They all appear to give machine addresses. After 20 or 30 such lines, these appear:

CPU0 core temp above threshold CPU clock throttled
CPU0 core temp/speed normal
Hardware error: machine check events logged

Then these three lines repeat with CPU1. It then seems to go into an infinite loop.

I shut the computer off and disconnected one of the monitors and tried rebooting. Results are very similar, except it eventually hangs rather than loops.

Any ideas? I surely would have thought that Nouveau could have run one monitor!

--------------------
burakkucat
 Posted: Feb 20 2014, 01:31 AM

It does seem like a bit of a coincidence but the message you have shown implies that there has been a hardware error.

QUOTE

CPU0 core temp above threshold CPU clock throttled
CPU0 core temp/speed normal
Hardware error: machine check events logged

The /var/log/mcelog file should contain the details.

From the mcelog manual page --

QUOTE

DESCRIPTION
       X86  CPUs  report errors detected by the CPU as machine check events (MCEs).  These can be data corruption detected in the CPU caches,
       in main memory by an integrated memory controller, data transfer errors on the front side bus or CPU interconnect  or  other  internal
       errors.  Possible causes can be cosmic radiation, instable power supplies, cooling problems, broken hardware, or bad luck.

       Most  errors  can  be  corrected by the CPU by internal error correction mechanisms. Uncorrected errors cause machine check exceptions
       which may panic the machine.

       When a corrected error happens the x86 kernel writes a record describing the MCE into a internal ring  buffer  available  through  the
       /dev/mcelog device mcelog retrieves errors from /dev/mcelog, decodes them into a human readable format and prints them on the standard
       output or optionally into the system log.

       Optionally it can also take more options like keeping statistics or triggering shell scripts on specific events.

       The normal operating modi for mcelog are running as a regular cron job (traditional way, deprecated), running as  a  trigger  directly
       executed by the kernel, or running as a daemon with the --daemon option.

       When  an  uncorrected  machine check error happens that the kernel cannot recover from then it will usually panic the system.  In this
       case when there was a warm reset after the panic mcelog should pick up the machine check errors after reboot.  This  is  not  possible
       after a cold reset.

       In  addition  mcelog  can be used on the command line to decode the kernel output for a fatal machine check panic in text format using
       the --ascii option. This is typically used to decode the panic console output of a fatal machine check, if the system was power cycled
       or mcelog didn’t run immediately after reboot.

       When  the  panic  triggers a kdump kexec crash kernel the crash kernel boot up script should log the machine checks to disk, otherwise
       they might be lost.

       Note that after mcelog retrieves an error the kernel doesn’t store it anymore (different from  dmesg(1)),  so  the  output  should  be
       always saved somewhere and mcelog not run in uncontrolled ways.

You might try bringing the system up into single-user mode and then examine the mcelog file.

At the same time, you could blacklist the nouveau driver. Create a /etc/modprobe.d/blacklist-nouveau.conf file which contains the following one line --

blacklist nouveau
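
A minimal sketch of creating that file from a shell prompt. The function name write_nouveau_blacklist and its optional directory argument are illustrative assumptions. On EL6 the nouveau module may also be loaded early from the initramfs, in which case a kernel boot parameter such as rdblacklist=nouveau is commonly used in addition to this file. Treat that as something to verify for your setup.

```shell
#!/bin/sh
# Create blacklist-nouveau.conf containing the single line "blacklist nouveau".
# The target directory defaults to /etc/modprobe.d; in rescue mode the installed
# system is usually mounted elsewhere, so pass that directory explicitly.
# write_nouveau_blacklist is a hypothetical helper name used for illustration.
write_nouveau_blacklist() {
    conf_dir=${1:-/etc/modprobe.d}
    printf 'blacklist nouveau\n' > "$conf_dir/blacklist-nouveau.conf"
}
```

From the install CD's rescue mode, something like write_nouveau_blacklist /mnt/sysimage/etc/modprobe.d would apply, assuming the installed system is mounted under /mnt/sysimage.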

I wonder if toracat is possibly available to assist?

--------------------
kenmorgan
 Posted: Feb 20 2014, 02:10 AM

I used the install CD to boot into repair mode. Here are my results:

(1) The mcelog file does not exist.

(2) I created the blacklist file for nouveau (using that rancid vi).

(3) I rebooted off the disk drive.

The boot seemed to be going normally but it simply hung at the following place:

Starting httpd
Starting crond
Starting atd
Starting certmonger

--------------------
kenmorgan
 Posted: Feb 20 2014, 02:56 AM

Sorry, guys. It's almost 10 PM Eastern Time, and at my age one runs out of steam at about that time.

Hopefully we can touch base again tomorrow. I really appreciate all your help.

Ken

--------------------
kenmorgan
 Posted: Feb 20 2014, 10:24 PM

Upon fiddling around, I made a new discovery. Again, I ran the install CD in rescue mode.

As I said above, the /var/log/mcelog file does not exist. However, there were a bunch of other logs in that directory. I looked at Xorg.0.log and found something very interesting. These are the last lines in the log:

NOUVEAU driver for NVIDIA chipset families

(Then there were ten or twelve GeForce numbers listed; 6200 was not one of them)

(++) using VT number 7
(EE) [drm] failed to open device
(EE) No devices detected
Fatal server error: no screens found

Okay: what does this tell us? I thought my blacklist of nouveau meant that nouveau would not be used. It seems here that the booting process was trying to use it. Anyway, what drivers are left if nouveau is not used? Anyway, why doesn't it find the screen? The boot process can sure find it before it hangs.

One other item: it happens that on this computer I have a dual boot system. During good weather when I can have all the windows in the room open to keep the atmosphere from fouling, I can also boot up Windows 7. It comes up fine. That would seem to indicate that there is no hardware problem with the GeForce 6200 card or with any other hardware.
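
For reading a log like the Xorg.0.log described above, a small filter that keeps only the warning, error, and fatal lines can save some scrolling. This is a sketch: the function name xorg_problems is hypothetical, and the pattern assumes the usual "(EE)" and "(WW)" markers at the start of a line, with or without a leading "[ seconds ]" timestamp.

```shell
#!/bin/sh
# Print only warning (WW), error (EE) and fatal-error lines from an Xorg log.
# The log path defaults to /var/log/Xorg.0.log.
# xorg_problems is a hypothetical helper name used for illustration.
xorg_problems() {
    log=${1:-/var/log/Xorg.0.log}
    # Optional "[   12.345] " timestamp, then a marker or the fatal-error banner
    grep -E '^(\[ *[0-9.]+\] )?(\((EE|WW)\)|Fatal server error)' "$log"
}
```

In rescue mode the installed system's log would be at a path such as /mnt/sysimage/var/log/Xorg.0.log (an assumption about where the rescue environment mounts the disk), so pass that path as the argument.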

--------------------
burakkucat
 Posted: Feb 24 2014, 10:02 PM

Sorry Ken, I have not been about for a while. (The Cattery is based in UTC land, so if the time is late for you it will be silly o'clock for me.)

The basic video driver, which should cope with about everything, is the classic vesa.

CODE

[Duo2 ~]$ rpm -qa xorg-x11-drv\* | sort
xorg-x11-drv-acecad-1.5.0-6.el6.x86_64
xorg-x11-drv-aiptek-1.4.1-4.el6.x86_64
xorg-x11-drv-apm-1.2.5-5.el6.x86_64
xorg-x11-drv-ast-0.97.0-2.el6.x86_64
xorg-x11-drv-ati-7.1.0-3.el6.x86_64
xorg-x11-drv-ati-firmware-7.1.0-3.el6.noarch
xorg-x11-drv-cirrus-1.5.2-1.el6_4.x86_64
xorg-x11-drv-dummy-0.3.6-2.el6.x86_64
xorg-x11-drv-elographics-1.4.1-2.el6.x86_64
xorg-x11-drv-evdev-2.7.3-5.el6.x86_64
xorg-x11-drv-fbdev-0.4.3-2.el6.x86_64
xorg-x11-drv-fpit-1.4.0-5.el6.x86_64
xorg-x11-drv-glint-1.2.8-3.el6.x86_64
xorg-x11-drv-hyperpen-1.4.1-4.el6.x86_64
xorg-x11-drv-i128-1.3.6-3.el6.x86_64
xorg-x11-drv-i740-1.3.4-5.el6.x86_64
xorg-x11-drv-intel-2.21.12-2.el6.x86_64
xorg-x11-drv-keyboard-1.6.2-7.el6.x86_64
xorg-x11-drv-mach64-6.9.3-4.1.el6_4.x86_64
xorg-x11-drv-mga-1.6.1-10.el6.x86_64
xorg-x11-drv-modesetting-0.5.0-1.el6.x86_64
xorg-x11-drv-mouse-1.8.1-7.el6.x86_64
xorg-x11-drv-mutouch-1.3.0-4.el6.x86_64
xorg-x11-drv-nouveau-1.0.1-4.el6.x86_64
xorg-x11-drv-nv-2.1.20-4.el6.x86_64
xorg-x11-drv-openchrome-0.3.0-3.20120806git.el6.x86_64
xorg-x11-drv-penmount-1.5.0-4.el6.x86_64
xorg-x11-drv-qxl-0.1.0-7.el6.x86_64
xorg-x11-drv-r128-6.9.1-1.el6.x86_64
xorg-x11-drv-rendition-4.2.5-2.el6.x86_64
xorg-x11-drv-s3virge-1.10.6-2.el6.x86_64
xorg-x11-drv-savage-2.3.6-2.el6.x86_64
xorg-x11-drv-siliconmotion-1.7.7-2.el6.x86_64
xorg-x11-drv-sis-0.10.7-2.el6.x86_64
xorg-x11-drv-sisusb-0.9.6-2.el6.x86_64
xorg-x11-drv-synaptics-1.6.2-13.el6.x86_64
xorg-x11-drv-tdfx-1.4.5-2.el6.x86_64
xorg-x11-drv-trident-1.3.6-4.el6.x86_64
xorg-x11-drv-v4l-0.2.0-6.el6.x86_64
xorg-x11-drv-vesa-2.3.2-4.el6.x86_64
xorg-x11-drv-vmmouse-12.9.0-10.el6.x86_64
xorg-x11-drv-vmware-12.0.2-3.20120718gite5ac80d8f.el6.x86_64
xorg-x11-drv-void-1.4.0-3.el6.x86_64
xorg-x11-drv-voodoo-1.2.5-3.el6.x86_64
xorg-x11-drv-wacom-0.16.1-4.el6.x86_64
xorg-x11-drv-xgi-1.6.0-18.20121114git.el6.x86_64
[Duo2 ~]$


It is interesting to know that when BGW (Billy Gates Ware a.k.a. Windoze) is booted up, the system behaves correctly. From that, I guess we can say that there is no apparent hardware issue.

I am not familiar with your nVidia controller card, so I need to ask if it has one or two video output ports? Is there also a video output port on the motherboard? If the latter is true, I would be tempted to remove the nVidia controller card and connect a monitor directly to the motherboard video output, for testing purposes.

Finally have you tested by allowing the system to boot into runlevel 3 (that is full multi-user mode without the GUI active)?
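
For a one-off test, appending a 3 to the end of the kernel line in the GRUB menu at boot time is enough. To make runlevel 3 the default, the initdefault line in /etc/inittab can be changed. The sketch below does that with sed and keeps a backup copy. The function name set_default_runlevel is a hypothetical helper, and the file argument exists mainly so the change can be tried on a copy first.

```shell
#!/bin/sh
# Change the default runlevel in an inittab file, keeping a .bak copy.
# Usage: set_default_runlevel 3 [/path/to/inittab]
# set_default_runlevel is a hypothetical helper name used for illustration.
set_default_runlevel() {
    level=$1
    inittab=${2:-/etc/inittab}
    case $level in
        [1-5]) ;;   # refuse 0 (halt) and 6 (reboot) as defaults
        *) echo "runlevel must be between 1 and 5" >&2; return 1 ;;
    esac
    sed -i.bak "s/^id:[0-9]:initdefault:/id:${level}:initdefault:/" "$inittab"
}
```

Running set_default_runlevel 3 as root makes the system boot to a text login, and set_default_runlevel 5 restores the graphical login.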

--------------------
kenmorgan
 Posted: Feb 25 2014, 12:55 AM

Welcome back, burakkucat!

Since the computer was down, I had to do something to get it up again. So I reinstalled the nVidia driver. The computer now boots as usual, but I also received a pleasant surprise. I went to the Web site that was making Firefox crash, and despite clicking about 20 different pages, Firefox never crashed. Since the problem was in an nVidia module, my current theory is that this module had gotten corrupted, and reinstalling the nVidia driver ipso facto replaced that module with an uncorrupted copy!

But to answer your questions:

(1) My graphics card does indeed have two output ports: one VGA and the other one of those new weird ones that start with an "H" I think. Anyway, I bought a converter so that my two VGA monitors have a place to plug in.

(2) I didn't know you could run two monitors (where they serve as a "single" screen rather than two duplicates) with one port.

(3) Anyway, with great wisdom and foresight, I bought a motherboard with no video output.

(4) No, I never tried to boot into runlevel 3. I used the CentOS install DVD to boot into "rescue" mode which gives me a (UNIX) prompt, which enabled me to reinstall the nVidia driver.

So...everything is "fixed" (I think). However, nVidia never was heaven-sent. The xscreensaver doesn't work properly with it, and sometimes upon boot the GNOME desktop comes up "wrong": the menu bar across the top of the screen doesn't work, and my "shortcut" icons appear on both screens, with only one set working. I simply keep rebooting, and finally (after two, three, or four times) the desktop opens properly. It always was a complete mystery. Therefore, if vesa will work better, I'm all for trying it.

Questions: do I (1) uninstall nvidia, (2) install vesa, then (3) reboot?

Also, where does xrandr come into this scenario?

Thanks so much.

Ken

--------------------
burakkucat
 Posted: Feb 25 2014, 05:38 PM

That is good news. :)

Now that you have the system operating correctly once again, I will suggest leaving it well alone! ;)

My comments, to your answers to my earlier questions --

(1) That will be an HDMI port.
(2) As far as I am aware, that is not possible. (I suspect you misunderstood my plans for testing . . . )
(3) That is not a deficiency but would make testing just a little bit more difficult.
(4) Understood. (There is more than one way to achieve the same result.)

QUOTE

Questions: do I (1) uninstall nvidia, (2) install vesa, then (3) reboot?

(1) No (2) No (3) No. As the system is working, leave well alone!

QUOTE

Also, where does xrandr come into this scenario?

As you are using the nVidia driver, xrandr does not come into the picture.

Say, for example, you replace the nVidia controller card with a simpler controller that just has two VGA ports, then you would use xrandr to enable them both.

--------------------
kenmorgan
 Posted: Feb 25 2014, 06:15 PM

Thanks again for such thorough answers! Unfortunately, I thought of some more questions.

(1) I checked, and vesa IS already installed. How do you make it "take over duties" if nVidia is removed? It certainly didn't do so when I uninstalled nVidia before! (The computer became unbootable.)

(2) Since I know now that I can go back to nVidia if the need should arise, I wouldn't mind experimenting. As I said,

QUOTE

However, nVidia never was heaven-sent. The xscreensaver doesn't work properly with it, and sometimes upon boot the GNOME desktop comes up "wrong": the menu bar across the top of the screen doesn't work, and my "shortcut" icons appear on both screens, with only one set working. I simply keep rebooting, and finally (after two, three, or four times) the desktop opens properly. It always was a complete mystery. Therefore, if vesa will work better, I'm all for trying it.


UNLESS: you think neither of these two problems can ever be corrected when running two monitors.


--------------------
burakkucat
 Posted: Feb 25 2014, 06:38 PM

Hmm . . .

I guess you will need to become proficient in X and write your own custom xorg.conf file.

As to the latter, you may find it simpler just to restart the X server (<Ctrl><Alt><Backspace>) rather than repeatedly reboot the system. Perhaps a change of video controller card would be the best long-term solution?

Perhaps other forum members may have some suggestions . . .
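
For the custom X configuration mentioned above, the smallest useful starting point is probably a single Device section that names the driver. The sketch below writes one that selects vesa. The function name write_vesa_conf, the file name 10-vesa.conf, and the identifier "Card0" are all illustrative assumptions; the same section could go into /etc/X11/xorg.conf instead. Note that vesa is a basic fallback driver and, as far as is known here, it will not give an extended desktop across two monitors.

```shell
#!/bin/sh
# Write a minimal X configuration fragment that selects the vesa driver.
# The target path defaults to /etc/X11/xorg.conf.d/10-vesa.conf (hypothetical name).
# write_vesa_conf is a hypothetical helper name used for illustration.
write_vesa_conf() {
    target=${1:-/etc/X11/xorg.conf.d/10-vesa.conf}
    cat > "$target" <<'EOF'
Section "Device"
    Identifier "Card0"
    Driver     "vesa"
EndSection
EOF
}
```

Removing the file and restarting the X server returns X to its automatic driver selection, so the experiment is easy to undo.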

--------------------