I have a Debian Lenny box running VLANs, BGP, pptpd, DNS, ip route, tc, and DHCP for about 600 users, of which roughly 350 are actually connected over PPTP at any given time. Using ip route, traffic is routed across four upstream internet providers; one of them is the primary, naturally the one with BGP. For more than a month the machine has been crashing while in operation, and to make it clearer I will describe how it happens:
The network suddenly locks up, the LEDs on the LAN cards blink much more slowly, the fans keep running normally, and the terminal freezes. If I unplug the RJ connector from one of the three LAN cards and plug it back in, the LED lights up again, but with a frozen terminal and no ping response I cannot work out what is going on at all ...
After I press the small reset button, the machine boots as if nothing had happened, and when I go through the logs in /var/log/* everything looks clean. I have read syslog, messages, and dmesg line by line, pasting lines into Google in the hope of finding a lead ... All of this happens at random, at intervals of anywhere between 12 hours and 7 days.
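Since the logs contain nothing from the moment of the hang, one thing that could be tried is recording the machine's state periodically and forcing it to disk, so that the last entry before a freeze survives the reset. The following is only a minimal sketch under assumptions: the function name `snapshot` and the log path in the usage note are hypothetical, and it records only the load average and free memory read from /proc.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original setup): append a timestamped
# snapshot of load average and free memory to the log file given as $1,
# then flush it to disk so it survives a hard reset.
snapshot() {
    log="$1"
    {
        # Timestamp plus the 1, 5 and 15 minute load averages
        printf '%s load=%s\n' "$(date '+%F %T')" "$(cut -d' ' -f1-3 /proc/loadavg)"
        # Free RAM and free swap as reported by the kernel
        grep -E '^(MemFree|SwapFree):' /proc/meminfo
    } >> "$log"
    # Push buffered writes out to disk before the next possible freeze
    sync
}
```

Called from cron once a minute, for example as `snapshot /var/log/hang-watch.log` (an assumed path), the last lines written before a hang would at least show whether load or memory was climbing just before the freeze.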
To make it more interesting, this is the second hardware configuration in one month. On the first one, after each crash I replaced parts one at a time: LAN cards, RAM, CPU, power supply, UPS, cables, connectors, until I finally replaced the whole machine with a similar one but with a different chipset, going from G31 to G33 ...
I think I have run out of ideas, and I feel terribly stupid (I am a stubborn person, but with unhappy customers on top of it all, everything feels miserable) ...
First, the hardware:
core1:~# lspci
00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller (rev 02)
00:02.0 VGA compatible controller: Intel Corporation 82G33/G31 Express Integrated Graphics Controller (rev 02)
00:03.0 Communication controller: Intel Corporation 82G33/G31/P35/P31 Express MEI Controller (rev 02)
00:19.0 Ethernet controller: Intel Corporation 82566DC-2 Gigabit Network Connection (rev 02)
00:1a.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5 (rev 02)
00:1a.2 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #6 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801I (ICH9 Family) HD Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 1 (rev 02)
00:1c.1 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 2 (rev 02)
00:1c.2 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 3 (rev 02)
00:1c.3 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 4 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 5 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92)
00:1f.0 ISA bridge: Intel Corporation 82801IH (ICH9DH) LPC Interface Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA AHCI Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801I (ICH9 Family) SMBus Controller (rev 02)
02:00.0 IDE interface: Marvell Technology Group Ltd. 88SE6101 single-port PATA133 interface (rev b2)
06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10)
06:01.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10)
06:03.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22/A IEEE-1394a-2000 Controller (PHY/Link)
core1:~# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
stepping : 11
cpu MHz : 2400.170
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips : 4804.30
clflush size : 64
power management:
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
stepping : 11
cpu MHz : 2400.170
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 2
cpu cores : 4
apicid : 2
initial apicid : 2
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips : 4800.55
clflush size : 64
power management:
processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
stepping : 11
cpu MHz : 2400.170
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 1
cpu cores : 4
apicid : 1
initial apicid : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips : 4800.52
clflush size : 64
power management:
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
stepping : 11
cpu MHz : 2400.170
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 4
apicid : 3
initial apicid : 3
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips : 5078.85
clflush size : 64
power management:
core1:~# free -m
total used free shared buffers cached
Mem: 3280 614 2666 0 32 106
-/+ buffers/cache: 474 2805
Swap: 1898 0 1898
core1:~# uname -a
Linux core1 2.6.26-1-686 #1 SMP Mon Dec 15 18:15:07 UTC 2008 i686 GNU/Linux
18:12:21 up 6:37, 310 users, load average: 3.66, 3.70, 3.65
core1:~# ps x | wc -l
712
core1:~# mii-tool
SIOCGMIIREG on eth0 failed: Input/output error
SIOCGMIIREG on eth0 failed: Input/output error
eth0: negotiated 1000baseT-FD flow-control, link ok
eth1: negotiated 1000baseT-FD flow-control, link ok
eth2: negotiated 1000baseT-FD flow-control, link ok
core1:~# ifconfig eth0
eth0 Link encap:Ethernet HWaddr 4c:00:10:52:73:3e
inet addr:10.21.1.3 Bcast:10.21.1.255 Mask:255.255.255.0
inet6 addr: fe80::4e00:10ff:fe52:733e/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:269457665 errors:0 dropped:114 overruns:0 frame:0
TX packets:262640379 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
RX bytes:3098056604 (2.8 GiB) TX bytes:3189622867 (2.9 GiB)
Memory:e0380000-e03a0000
core1:~# ifconfig eth1
eth1 Link encap:Ethernet HWaddr 00:e0:4c:69:1c:bc
inet addr:212.233.252.130 Bcast:212.233.252.255 Mask:255.255.255.128
inet6 addr: fe80::2e0:4cff:fe69:1cbc/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:25908281 errors:0 dropped:0 overruns:0 frame:0
TX packets:24797024 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3763152504 (3.5 GiB) TX bytes:3995252507 (3.7 GiB)
Interrupt:21 Base address:0xe900
core1:~# ifconfig eth2
eth2 Link encap:Ethernet HWaddr 00:e0:4c:69:1c:b2
inet addr:172.16.10.1 Bcast:172.16.10.255 Mask:255.255.255.0
inet6 addr: fe80::2e0:4cff:fe69:1cb2/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:294117121 errors:0 dropped:0 overruns:0 frame:0
TX packets:290833647 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:349273204 (333.0 MiB) TX bytes:3882010125 (3.6 GiB)
Interrupt:22 Base address:0x4800
core1:~# ifconfig vlan149
vlan149 Link encap:Ethernet HWaddr 00:e0:4c:69:1c:b2
inet addr:212.70.158.90 Bcast:212.70.158.91 Mask:255.255.255.252
inet6 addr: fe80::2e0:4cff:fe69:1cb2/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:293509990 errors:0 dropped:0 overruns:0 frame:0
TX packets:290483585 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:141740777 (135.1 MiB) TX bytes:3591439227 (3.3 GiB)
core1:~# ifconfig tun0
tun0 Link encap:UNSPEC HWaddr 0A-15-01-03-C4-08-00-00-00-00-00-00-00-00-00-00
inet addr:194.141.67.110 P-t-P:194.141.67.109 Mask:255.255.255.252
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1476 Metric:1
RX packets:1028533 errors:0 dropped:0 overruns:0 frame:0
TX packets:889318 errors:2 dropped:0 overruns:0 carrier:2
collisions:0 txqueuelen:0
RX bytes:306586890 (292.3 MiB) TX bytes:121042236 (115.4 MiB)
core1:~# ifconfig tun1
tun1 Link encap:UNSPEC HWaddr 0A-15-01-03-74-08-00-00-00-00-00-00-00-00-00-00
inet addr:172.16.25.1 P-t-P:172.16.25.2 Mask:255.255.255.252
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1476 Metric:1
RX packets:37674 errors:0 dropped:0 overruns:0 frame:0
TX packets:46831 errors:3 dropped:0 overruns:0 carrier:3
collisions:0 txqueuelen:0
RX bytes:4360976 (4.1 MiB) TX bytes:58444498 (55.7 MiB)
I would be grateful if we could somehow unravel this mystery ...