Hello everyone. I am building a websocket client app that receives data and inserts it into a MongoDB database. The program runs, but after a while exceptions start being raised.

import websocket
import _thread
import time
import json
import asyncio

import os
import pwd
import socket
import traceback as tb

import pymongo
from datetime import datetime
import threading
import ssl

httpthreat = mongo['httpthreat']
collection1 = httpthreat['collection1']
websocket_link = "wss://somewebservice.com/service"

def get_data(HOST,PORT,TIMEOUT=5,RETRY=3):
    for itry in range(1,RETRY+1):
        try:   
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.settimeout(TIMEOUT)
            sock.connect((HOST, PORT))

            # processing some data
            # processing some data
            # processing some data

            return str(data,'utf-8')
        except Exception as e:
            traceback_str = ''.join(tb.format_exception(None, e, e.__traceback__))
            print("get_data exception")
            time.sleep(TIMEOUT)
            print(traceback_str)
    return None


def process_data(IP,PORT):
    try:
        data = get_data(IP,int(PORT))
        collection1.insert_one({"ip":IP,"port":PORT,"data":data,"date":datetime.now()}) # <==== this line throws exceptions

        # i even tried this, but no use:
        '''
        for i in range(5):
            try:
                collection1.insert_one({"ip":IP,"port":PORT,"data":data,"date":datetime.now()}) # <==== this line throws exceptions anyway
                break
            except pymongo.errors.AutoReconnect:
                print("pymongo.errors.AutoReconnect exception ["+str(i)+"]")
                time.sleep(pow(2, i))
        '''


    except Exception as e:
        traceback_str = ''.join(tb.format_exception(None, e, e.__traceback__))
        print("process_data exception")
        print(traceback_str)


def on_message(ws, message):
    message_json = json.loads(message)
    if(message_json['protocol'] == "http"):
        try:
            threads = [threading.Thread(target=process_data, args=(message_json['ip'], message_json['port']))]
            for thread in threads:
                thread.start()
            #for thread in threads:
                #thread.join() # waits for thread to complete its task <==== if i uncomment this, program slows
        except Exception as e:
            print("on_message exception")
            traceback_str = ''.join(tb.format_exception(None, e, e.__traceback__))
            print(traceback_str)

def on_error(ws, error):
    print(error)

def on_close(ws, close_status_code, close_msg):
    print("### closed ###")

    time.sleep(1)
    websocket.enableTrace(True)
    ws = websocket.WebSocketApp(websocket_link,
                              on_open=on_open,
                              on_message=on_message,
                              on_error=on_error,
                              on_close=on_close)
    ws.daemon = True
    threading.Thread(target=ws.run_forever(ping_interval=70, ping_timeout=10,sslopt={"cert_reqs": ssl.CERT_NONE}))

def on_open(ws):
    print("Opened connection")

if __name__ == "__main__":
    while True:
        try:
            print("Starting")
            websocket.enableTrace(True)
            ws = websocket.WebSocketApp(websocket_link,
                                      on_open=on_open,
                                      on_message=on_message,
                                      on_error=on_error,
                                      on_close=on_close)
            ws.daemon = True
            threading.Thread(target=ws.run_forever(ping_interval=70, ping_timeout=10,sslopt={"cert_reqs": ssl.CERT_NONE}))

        except Exception as e:
            traceback_str = ''.join(tb.format_exception(None, e, e.__traceback__))
            print("main exception")
            print(traceback_str)

        print("Restarting ...")
        time.sleep(1)

I am using the websocket library and Python 3.

Errors that I am getting:

Error1 (occurrence interval 10 minutes):

ping/pong timed out
send: b'\x88\x82c\xa0\xc1\xc0`H'
### closed ###
[Errno 9] Bad file descriptor
### closed ###
--- request header ---
GET /ws/service HTTP/1.1
Upgrade: websocket
Connection: Upgrade
Host: somewebservice.net
Origin: http://somewebservice.net
Sec-WebSocket-Key: Rz2+MPdkcq0zgQqgQN/c7w==
Sec-WebSocket-Version: 13


-----------------------
--- response header ---
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: xRBxqrDNUMz7eNe1hwCxe11Pkxk=
-----------------------
Opened connection

I don't understand why this occurs at this interval; if I load the socket in a browser, this never happens.

Error2 (occurrence interval about 10 minutes):

send: b'\x88\x82\xe4\xadP\xfd\xe7E'
error from callback <function on_close at 0x7fe2e5b4b9d8>: [Errno 9] Bad file descriptor
  File "/usr/lib/python3/dist-packages/websocket/_app.py", line 335, in _callback
    callback(self, *args)
  File "script.py", line 171, in on_close
    threading.Thread(target=ws.run_forever(ping_interval=70, ping_timeout=10,sslopt={"cert_reqs": ssl.CERT_NONE}))
  File "/usr/lib/python3/dist-packages/websocket/_app.py", line 302, in run_forever
    teardown()
  File "/usr/lib/python3/dist-packages/websocket/_app.py", line 226, in teardown
    self.sock.close()
  File "/usr/lib/python3/dist-packages/websocket/_core.py", line 420, in close
    self.shutdown()
  File "/usr/lib/python3/dist-packages/websocket/_core.py", line 432, in shutdown
    self.sock.close()
  File "/usr/lib/python3.7/socket.py", line 420, in close
    self._real_close()
  File "/usr/lib/python3.7/ssl.py", line 1108, in _real_close
    super()._real_close()
  File "/usr/lib/python3.7/socket.py", line 414, in _real_close
    _ss.close(self)
--- request header ---
GET /ws/service HTTP/1.1
Upgrade: websocket
Connection: Upgrade
Host: something.net

When this happens, it triggers a restart of the websocket.WebSocketApp.

I am also getting a lot of 'pymongo' errors

Error3:

Traceback (most recent call last):
  File "script.py", line 100, in process_data
    something.insert_one({"ip":IP,"port":PORT,"date":datetime.now()})
  File "/usr/local/lib/python3.7/dist-packages/pymongo/collection.py", line 613, in insert_one
    comment=comment,
  File "/usr/local/lib/python3.7/dist-packages/pymongo/collection.py", line 547, in _insert_one
    self.__database.client._retryable_write(acknowledged, _insert_command, session)
  File "/usr/local/lib/python3.7/dist-packages/pymongo/mongo_client.py", line 1399, in _retryable_write
    return self._retry_with_session(retryable, func, s, None)
  File "/usr/local/lib/python3.7/dist-packages/pymongo/mongo_client.py", line 1286, in _retry_with_session
    return self._retry_internal(retryable, func, session, bulk)
  File "/usr/local/lib/python3.7/dist-packages/pymongo/mongo_client.py", line 1320, in _retry_internal
    return func(session, sock_info, retryable)
  File "/usr/local/lib/python3.7/dist-packages/pymongo/collection.py", line 542, in _insert_command
    retryable_write=retryable_write,
  File "/usr/local/lib/python3.7/dist-packages/pymongo/pool.py", line 770, in command
    self._raise_connection_failure(error)
  File "/usr/local/lib/python3.7/dist-packages/pymongo/pool.py", line 764, in command
    exhaust_allowed=exhaust_allowed,
  File "/usr/local/lib/python3.7/dist-packages/pymongo/network.py", line 150, in command
    reply = receive_message(sock_info, request_id)
  File "/usr/local/lib/python3.7/dist-packages/pymongo/network.py", line 213, in receive_message
    raise ProtocolError("Got response id %r but expected %r" % (response_to, request_id))
pymongo.errors.ProtocolError: Got response id 1852141647 but expected 402724286

After a while I get a lot of pymongo.errors.AutoReconnect exceptions, and then the program no longer writes to the database.

I don't understand how a program can run for a while and then start throwing exceptions like crazy, especially when it is connected to a MongoDB database through pymongo. I have read that pymongo is not fork-safe, but it is thread-safe.

If I load the websocket in a browser, it simply works with no problems at all.

Any hints, please?

Keep in mind that I am not a Python expert, but I do know that this is the error message I've seen when you close a file that isn't open.
I am guessing the entire code wasn't shared, because I can't find the open and close code. Maybe it's buried in there, but do examine your open and close pairs.
The following is about an entirely different system, but I found we couldn't hold the SQL connection open forever. My workaround, at the cost of a minor speed hit, was to implement each data write as open, write, close, which worked for that job.
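
To make the open, write, close idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 purely as a stand-in for the SQL system mentioned above; it is not pymongo code, and the function name `write_record` and the table name `records` are made up for illustration.

```python
import sqlite3
from contextlib import closing
from datetime import datetime


def write_record(db_path, ip, port, data):
    """Open a connection, write one row, commit, and close again."""
    # closing() makes sure conn.close() runs even if the insert raises.
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS records "
            "(ip TEXT, port INTEGER, data TEXT, date TEXT)"
        )
        conn.execute(
            "INSERT INTO records VALUES (?, ?, ?, ?)",
            (ip, port, data, datetime.now().isoformat()),
        )
        conn.commit()
```

Each call pays the cost of a fresh connection, which is the speed hit described above, but no connection is ever held open long enough to go stale.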

commented: I think it is a problem related to threading, as if one exception breaks the rest of the code, but I don't know where.

From what I see above, it looks like an incomplete code reveal, but I'll try anyway. I was dealing with a huge code base written by someone no longer with the company. I was just another consultant called in to see if I could "make it work." They were not interested in a rewrite, but hey, they did fly me to another continent, so let's give it a shot.

My suspect was not unlike the above: a call to close is made when the object is already closed. So I added a global variable to track the opening and closing of the file, then added a test along the lines of "if the file is open, then close()". That solved that little problem, though I bet some will tell me I did it wrong. When you only have so much time, you only have so much time.

Now, if Python gives you a way to check whether the file is open, you can test that the object is open before closing it.
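
A rough sketch of that "if open, then close" test, written as a small wrapper instead of a global variable. The class name `GuardedResource` is hypothetical and not part of any library; it assumes only that the wrapped object has a `close()` method, and it uses a lock because the original program closes things from several threads.

```python
import threading


class GuardedResource:
    """Wraps an object with a close() method so it is closed at most once."""

    def __init__(self, resource):
        self._resource = resource
        self._is_open = True
        self._lock = threading.Lock()  # several threads may try to close

    def close(self):
        """Close the wrapped object if it is still open.

        Returns True if this call performed the close, False otherwise.
        """
        with self._lock:
            if not self._is_open:
                return False  # already closed, nothing to do
            self._is_open = False
        self._resource.close()
        return True
```

Calling `close()` a second time then becomes a harmless no-op instead of a second close on the underlying object.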

commented: So the code is trying to use a resource that is already closed; this sounds to me like a race condition.

Sorry if I was unclear, but the code has few comments and looks incomplete, so again: the error message is what I see when a close is performed on an already closed object. Above I described one method I used when I had no time for a complete code analysis, or when I'm only working with a subset of the code.

Again, if the object is closed, calling close will result in an error. Either trace through your code or put in a test to see whether the object is open.

commented: Yes, you are right. The problem is that this happens in a threaded environment, so it is hard to track down where the problem is.

This is why in my example I added code to work around the issue. It's your choice: understand your code, or add code to avoid the problem.

Also, I find that a lot of newer programmers don't know to add debug statements to help understand what happens. For example, you could log the open and close calls to see if there is a mismatch. But I can't know whether this is all worth it when the fix could be as simple as changing the close to an IF OPEN THEN CLOSE statement.
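
As an illustration of that kind of debug statement, here is one possible sketch using the standard logging module. The helper `trace` and the logger name `conn-trace` are invented for this example; the idea is only to count open and close events and print the thread name with each one, so a mismatch shows up in the log.

```python
import logging
import threading

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(threadName)s %(message)s",
)
log = logging.getLogger("conn-trace")

_counts = {"open": 0, "close": 0}
_counts_lock = threading.Lock()


def trace(event, name):
    """Log an 'open' or 'close' event and return the running totals."""
    with _counts_lock:
        _counts[event] += 1
        snapshot = dict(_counts)  # copy so the caller sees a stable view
    log.debug("%s %s (opens=%d, closes=%d)",
              event, name, snapshot["open"], snapshot["close"])
    return snapshot
```

In the posted program you could call `trace("open", "ws")` from `on_open` and `trace("close", "ws")` from `on_close`; more closes than opens in the log would point at the mismatch.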
