Problem with freezing receiver (software or hardware)???

  • Hello,


    My configuration is:
    DM 7020HD v1 - bought about 3-5 months ago
    - DVBS2 , DVB-C/T
    - images (Merlin-3_OE_2.0-dm7020hd-20130303, release-dm7020hd-4.0.0)
    - GP 3.2
    - BarryAllen 10.0.3 + USB 16 GB
    - hdd 1.0 TB


    For about 2 weeks I have had a problem with the DM7020 freezing.
    It happens late at night (between 01:00-03:00), always when the receiver is not in use (see picture: freezed_display.jpg).


    [Blocked image: http://imageshack.us/a/img546/1977/l2ee.jpg]
    Because the receiver does not react to the remote control, I tried to restart enigma2 via telnet, but the receiver responds with "enigma2: no process found" (see attachment: enigma2_no_process.jpg), so I rebooted it with a telnet command.


    [Blocked image: http://imageshack.us/a/img198/3334/n5l0.jpg]
    During the boot sequence the receiver freezes partway through (see attachment: boot_sequence.jpg).
    [Blocked image: http://imageshack.us/a/img713/3166/g7nc.jpg]


    When I execute binaries via telnet, the receiver returns a lot of errors (see attachment: bins_execution_errors.jpg).
    [Blocked image: http://imageshack.us/a/img10/3158/l970.jpg]


    OPKG update/upgrade is also impossible (attachment: opkg.jpg).
    [Blocked image: http://imageshack.us/a/img856/5766/u4t8.jpg]


    The only thing that helps is reflashing, but I can't see myself doing that every third day.
    What is wrong with my receiver and/or configuration? What can I do?

    "Reasonable people adapt themselves to the world.
    Unreasonable people attempt to adapt the world to themselves.
    All progress, therefore, depends on unreasonable people"


    ============
    SubsDownloader

    Edited once, last by silelis ()

  • Is your receiver accessible via the Internet?
    If yes, I guess that you got hacked by a botnet crawler.
    Did you try to boot WITHOUT Barry Allen (read: a fresh
    boot from a fresh image installed in the box's flash)?


    Btw. After "killall", you need to start enigma2 by hand.
    It will not automagically start.
    Use "init 4" on the command line to stop and "init 3" to
    start enigma2.

    DM900 SS, DM8000SSSS
    No support via PM! Use the forum for questions, so that others benefit from them too.

  • Could you please post (separately and in code tags) the output of the following commands, executed when the problems pop up again? Make sure to have PuTTY set to fullscreen with a large scrollback buffer:

    Code
    ps aux


    Code
    dmesg


    Code
    cat /var/log/messages


    Code
    cat /home/root/.ssh/known_hosts


    Code
    cat /home/root/.ash_history


    Code
    crontab -l


    From your description I would not guess botnet, but ubifs/filesystem trouble.
    Some of the above commands might be helpful in ruling out an infection.
    To eliminate the second flash as a culprit, make sure to follow the steps kenat described.
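
Since some of these outputs are long, it may be easier to capture them all in one file instead of copying from the terminal. A minimal sketch, assuming a POSIX shell such as the box's BusyBox ash; the function name `collect_diag` and the log path are made up for illustration:

```shell
#!/bin/sh
# collect_diag LOGFILE
# Reads one command per line from stdin, runs each, and writes the output
# (including error messages) to LOGFILE under a header naming the command.
collect_diag() {
    log=$1
    : > "$log"                          # start with an empty log file
    while IFS= read -r cmd; do
        [ -n "$cmd" ] || continue       # skip blank lines
        printf '===== %s =====\n' "$cmd" >> "$log"
        # stdin is redirected so a command cannot swallow the command list;
        # stderr is kept so "No such file or directory" etc. is recorded too
        sh -c "$cmd" >> "$log" 2>&1 < /dev/null
    done
}
```

It could be called as `collect_diag /tmp/diag.log < commands.txt`, where `commands.txt` (a hypothetical file) lists the commands above, one per line. The headers keep the individual outputs separable when the file is posted.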

  • Below I have also added the S.M.A.R.T. HDD test (I can't interpret the output, but HDD problems may be one of the causes).


    [Blocked image: http://imageshack.us/a/img850/2463/t1ec.jpg]



    WilliamG, all the telnet commands were run after an unfinished reboot (the receiver froze during the reboot).


    The output of ps aux, dmesg and cat /var/log/messages is too long, so I added it as attachments at the end.



    cat_home_root_.ash_history:

    Code
    cat: can't open '/home/root/.ash_history': No such file or directory


    cat_home_root_.ssh_known_hosts:

    Code
    cat: can't open '/home/root/.ssh/known_hosts': No such file or directory


    crontab -1

    Code
    crontab: can't open 'root': No such file or directory

  • According to the SMART output, I would recommend disconnecting the HDD
    (e.g. remove the SATA connector from the disk) and retrying a start.
    If the box starts, the HDD is most probably the problem.
    Although the disk is not known to the smart tool, the values are not very
    promising (to say the least). Value "pre-fail" is always a bad sign.


  • Hmm.. pretty strange that there is no shell history.
    Could you please post the output of cat /etc/cron/crontabs?
    I guess the VPN connections (dawid, daniel, lupa2) are intentional and not malicious?


    Would be interesting to know if the problems persist without BA and openvpn.
    For further debugging you could log the box's serial output (via the service port). Maybe the kernel messages just before the freeze could shed some light on the situation.


    Edit: The S.M.A.R.T. values look okay for an older HDD. They are still far enough from their thresholds.
    Not sure why kenat thinks otherwise.

    Edited once, last by WilliamG ()

  • Good that Barry allen 10* is not supported anymore, so I don't need to offer help.

  • Regarding the interpretation of S.M.A.R.T. results, the smartctl manpage is quite helpful.
    A small excerpt:

    Quote


    Each Attribute also has a Threshold value (whose range is 0 to 255) which is printed under the heading "THRESH". If the Normalized value is less than or equal to the Threshold value, then the Attribute is said to have failed. If the Attribute is a pre-failure Attribute, then disk failure is imminent.
    [...]
    The Attribute table printed out by smartctl also shows the "TYPE" of the Attribute. Attributes are one of two possible types: Pre-failure or Old age. Pre-failure Attributes are ones which, if less than or equal to their threshold values, indicate pending disk failure. Old age, or usage Attributes, are ones which indicate end-of-product life from old-age or normal aging and wearout, if the Attribute value is less than or equal to the threshold. Please note: the fact that an Attribute is of type 'Pre-fail' does not mean that your disk is about to fail! It only has this meaning if the Attribute's current Normalized value is less than or equal to the threshold value.
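
The rule in this excerpt (an attribute has failed only if its normalized value is less than or equal to its threshold) can be checked mechanically. A rough sketch, assuming the usual column order of the `smartctl -A` table (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, ...); the function name `smart_check` is made up for illustration:

```shell
#!/bin/sh
# smart_check: reads a `smartctl -A` attribute table on stdin and prints,
# for every attribute row, its type, normalized value, threshold and verdict.
smart_check() {
    awk '
        # Attribute rows start with a numeric ID; header lines do not.
        $1 ~ /^[0-9]+$/ && NF >= 10 {
            value  = $4 + 0     # VALUE column, e.g. "030" becomes 30
            thresh = $6 + 0     # THRESH column
            verdict = (value <= thresh) ? "FAILED" : "ok"
            printf "%-26s %-9s value=%d thresh=%d %s\n", $2, $7, value, thresh, verdict
        }
    '
}
```

On the box it could be used as `smartctl -A /dev/sda | smart_check`, where `/dev/sda` is an assumption about the disk's device name. A "Pre-fail" row marked `ok` is not a warning sign by itself; only rows marked `FAILED` meet the manpage's failure condition.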

  • I will try to test the HDD with SeaTools on a PC.


    Quote

    For further debugging you could log the box's serial output (via the service port). Maybe the kernel messages just before the freeze could shed some light on the situation.

    How do I do this?

  • Could not find an English bootlog howto with a quick google search. Here's the long version (German).
    Short version:

    • Connect your box to your PC (via the service port) using a mini-USB cable.
    • When using Windows, install the USB to UART drivers.
    • Connect to the new virtual COM port using a suitable application (e.g. PuTTY, HyperTerminal, Minicom or GNU screen) with these settings:

      Code
      Bps:				115200
      Databits:			8
      Parity:				No
      Stopbits:			1
      Flow control:			Hardware


    • Boot the box.
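
On a Linux PC, the steps above could also be done without a terminal program. This is only a sketch, assuming GNU `stty` (for the `-F` option) and that the USB adapter shows up as a character device such as `/dev/ttyUSB0`; the device name and the function name `log_serial` are assumptions, not something taken from the howto:

```shell
#!/bin/sh
# log_serial PORT LOGFILE
# Configures PORT with the settings listed above and appends everything
# the box sends to LOGFILE while also showing it on screen.
log_serial() {
    port=$1
    log=$2
    if [ ! -c "$port" ]; then
        echo "log_serial: $port is not a character device" >&2
        return 1
    fi
    # 115200 bps, 8 data bits, no parity, 1 stop bit, hardware (RTS/CTS) flow control
    stty -F "$port" 115200 cs8 -parenb -cstopb crtscts raw -echo || return 1
    # Runs until interrupted with Ctrl-C
    cat "$port" | tee -a "$log"
}
```

Started as `log_serial /dev/ttyUSB0 bootlog.txt` before the box is booted, it leaves a log that still contains the last kernel messages when the box freezes overnight.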


    It would be helpful if you could also answer the questions I posted above :)

  • I flashed the DM using DreamUp.


    I wonder whether bad sectors could be the cause. Below is the DreamUp output:

  • ...
    Edit: The S.M.A.R.T. values look okay for an older HDD. They are still far enough from their thresholds.
    Not sure why kenat thinks otherwise.


    Because the RAW_READ_ERROR_RATE seems unreasonably high.
    The high seek error rate seems to be "usual" for Seagate drives,
    but I am not sure about the read errors.
    A simple test (boot without the HDD) would reveal whether this is the cause
    of the problems or not.


  • Yesterday I checked the HDD with SeaTools (the manufacturer's tool); it passed the long test without any errors.


    To me the only remaining explanation is bad blocks in the flash. We will see.


    BTW, should I do something about the two bad blocks I noticed? I mean:


    Code
    18:16:40 Log: +++ 006 erase failed at 1dcc0000: c1. Blockwill be marked as bad.
    18:16:49 Log: +++ 006 erase failed at 278c0000: c1. Blockwill be marked as bad.



    Is this grounds for a warranty claim?

  • Quote

    crontab -1

    Not 1... use the l

    root@dm7020hd:~# crontab -l
    # DO NOT EDIT THIS FILE - edit the master and reinstall.
    # (/tmp/crontab.16563 installed on Fri Sep 20 00:10:02 2013)
    # (Cron version -- $Id: crontab.c,v 2.13 1994/01/17 03:20:37 vixie Exp $)
    00 12 * * * /usr/script/moviecleaner.sh
    00 19 15 * * /usr/script/backup.sh
    00,15,30,45 23,00,1,2,3,4,5 * * * /usr/script/shutdown.sh
    root@dm7020hd:~#


    Next question:
    What is /bin/mi ?


    Also try

    Code
    for user in $(cut -f1 -d: /etc/passwd); do crontab -u $user -l; done


    This should show you the crontabs of all users.


    root@dm7020hd:/etc/cron/crontabs# for user in $(cut -f1 -d: /etc/passwd); do crontab -u $user -l; done
    # DO NOT EDIT THIS FILE - edit the master and reinstall.
    # (/tmp/crontab.16563 installed on Fri Sep 20 00:10:02 2013)
    # (Cron version -- $Id: crontab.c,v 2.13 1994/01/17 03:20:37 vixie Exp $)
    00 12 * * * /usr/script/moviecleaner.sh
    00 19 15 * * /usr/script/backup.sh
    00,15,30,45 23,00,1,2,3,4,5 * * * /usr/script/shutdown.sh
    no crontab for daemon
    no crontab for bin
    no crontab for sys
    no crontab for sync
    no crontab for games
    no crontab for man
    no crontab for lp
    no crontab for mail
    ......




    Also, I am still missing an answer to this question:

    Quote

    I guess the VPN connections (dawid, daniel, lupa2) are intentional and not malicious?


    --
    openwrt + minicom + screen = 24/7 Bootlog

    Edited 3 times, last by Schnello ()

  • I guess the VPN connections (dawid, daniel, lupa2) are intentional and not malicious?

    The VPN connections work well (until the receiver freezes). But I also use the Transmission plugin. Maybe the Ethernet module is the problem, but I don't know how to check it.


    WilliamG
    To rule out the software (original DMM image + GP3.2) as the cause, I installed the Newnigma 4.0.4 image on 24.09.2013 (at about 20:00). Unfortunately, on 25.09.2013 at 10:44 the receiver froze once again.
    This time, after the receiver froze and before power-cycling it, I ran the telnet commands (ps aux, dmesg, cat /var/log/messages, cat /home/root/.ssh/known_hosts, cat /home/root/.ash_history, crontab -l); the logs are attached. Maybe you will notice something.
    [Blocked image: http://imageshack.us/a/img850/6311/mlzw.jpg]


    The one thing I don't understand is why, directly after the receiver froze, telnet gives me such strange output:
    [Blocked image: http://imageshack.us/a/img10/3158/l970.jpg]
    To me it looks as if the flash does not hold data (some data gets erased).

  • For others who run into a similar problem:
    It was a hardware failure. I sent the receiver to DMM and they gave me a new unit.


    BTW, how can I find out the hardware version? I want to flash it with the correct image version.

  • As far as I know, the bootloader prevents you from installing an image for the wrong version of the box. So I guess it's trial and error for you.

    How can we win, when fools can be kings?