Admin Tools
Editing file: universaldetector.cpython-312.pyc
# Compiled from: /builddir/build/BUILDROOT/alt-python312-pip-23.3.1-1.el8.x86_64/opt/alt/python312/lib/python3.12/site-packages/pip/_vendor/chardet/universaldetector.py
"""
Module containing the UniversalDetector detector class, which is the primary
class a user of ``chardet`` should use.

:author: Mark Pilgrim (initial port to Python)
:author: Shy Shalom (original C code)
:author: Dan Blanchard (major refactoring for 3.0)
:author: Ian Cordasco
"""

import codecs
import logging
import re
from typing import List, Optional, Union

from .charsetgroupprober import CharSetGroupProber
from .charsetprober import CharSetProber
from .enums import InputState, LanguageFilter, ProbingState
from .escprober import EscCharSetProber
from .latin1prober import Latin1Prober
from .macromanprober import MacRomanProber
from .mbcsgroupprober import MBCSGroupProber
from .resultdict import ResultDict
from .sbcsgroupprober import SBCSGroupProber
from .utf1632prober import UTF1632Prober


class UniversalDetector:
    """
    The ``UniversalDetector`` class underlies the ``chardet.detect`` function
    and coordinates all of the different charset probers.

    To get a ``dict`` containing an encoding and its confidence, you can simply
    run:

    .. code::

            u = UniversalDetector()
            u.feed(some_bytes)
            u.close()
            detected = u.result

    """

    MINIMUM_THRESHOLD = 0.20
    HIGH_BYTE_DETECTOR = re.compile(b"[\x80-\xFF]")
    ESC_DETECTOR = re.compile(b"(\033|~{)")
    WIN_BYTE_DETECTOR = re.compile(b"[\x80-\x9F]")
    ISO_WIN_MAP = {
        "iso-8859-1": "Windows-1252",
        "iso-8859-2": "Windows-1250",
        "iso-8859-5": "Windows-1251",
        "iso-8859-6": "Windows-1256",
        "iso-8859-7": "Windows-1253",
        "iso-8859-8": "Windows-1255",
        "iso-8859-9": "Windows-1254",
        "iso-8859-13": "Windows-1257",
    }
    LEGACY_MAP = {
        "ascii": "Windows-1252",
        "iso-8859-1": "Windows-1252",
        "tis-620": "ISO-8859-11",
        "iso-8859-9": "Windows-1254",
        "gb2312": "GB18030",
        "euc-kr": "CP949",
        "utf-16le": "UTF-16",
    }

    def __init__(
        self,
        lang_filter: LanguageFilter = LanguageFilter.ALL,
        should_rename_legacy: bool = False,
    ) -> None:
        self._esc_charset_prober: Optional[EscCharSetProber] = None
        self._utf1632_prober: Optional[UTF1632Prober] = None
        self._charset_probers: List[CharSetProber] = []
        self.result: ResultDict = {
            "encoding": None,
            "confidence": 0.0,
            "language": None,
        }
        self.done = False
        self._got_data = False
        self._input_state = InputState.PURE_ASCII
        self._last_char = b""
        self.lang_filter = lang_filter
        self.logger = logging.getLogger(__name__)
        self._has_win_bytes = False
        self.should_rename_legacy = should_rename_legacy
        self.reset()

    @property
    def input_state(self) -> int:
        return self._input_state

    @property
    def has_win_bytes(self) -> bool:
        return self._has_win_bytes

    @property
    def charset_probers(self) -> List[CharSetProber]:
        return self._charset_probers

    def reset(self) -> None:
        """
        Reset the UniversalDetector and all of its probers back to their
        initial states.  This is called by ``__init__``, so you only need to
        call this directly in between analyses of different documents.
        """
        self.result = {"encoding": None, "confidence": 0.0, "language": None}
        self.done = False
        self._got_data = False
        self._has_win_bytes = False
        self._input_state = InputState.PURE_ASCII
        self._last_char = b""
        if self._esc_charset_prober:
            self._esc_charset_prober.reset()
        if self._utf1632_prober:
            self._utf1632_prober.reset()
        for prober in self._charset_probers:
            prober.reset()