d0'2ddlmZmZddlmZmZmZddlmZm Z m Z ddl m Z m Z mZddlmZmZmZddlmZmZmZddlmZGd d ZGd d eZGd deZGddeZGddeZGddeZGddeZGddeZ y))TupleUnion)BIG5_CHAR_TO_FREQ_ORDERBIG5_TABLE_SIZEBIG5_TYPICAL_DISTRIBUTION_RATIO)EUCKR_CHAR_TO_FREQ_ORDEREUCKR_TABLE_SIZE EUCKR_TYPICAL_DISTRIBUTION_RATIO)EUCTW_CHAR_TO_FREQ_ORDEREUCTW_TABLE_SIZE EUCTW_TYPICAL_DISTRIBUTION_RATIO)GB2312_CHAR_TO_FREQ_ORDERGB2312_TABLE_SIZE!GB2312_TYPICAL_DISTRIBUTION_RATIO)JIS_CHAR_TO_FREQ_ORDERJIS_TABLE_SIZEJIS_TYPICAL_DISTRIBUTION_RATIO)JOHAB_TO_EUCKR_ORDER_TABLEc|eZdZdZdZdZdZddZddZd e e e fd e ddfd Z defd Zdefd Zde e e fde fdZy)CharDistributionAnalysisigGz?g{Gz?returnNct|_d|_d|_d|_d|_d|_|jy)NrgF)tuple_char_to_freq_order _table_sizetypical_distribution_ratio_done _total_chars _freq_charsresetselfs :/usr/lib/python3/dist-packages/chardet/chardistribution.py__init__z!CharDistributionAnalysis.__init__@s@5:G  +.'  c.d|_d|_d|_y)zreset analyser, clear any stateFrN)rr r!r#s r%r"zCharDistributionAnalysis.resetOs r'charchar_lenc|dk(r|j|}nd}|dk\rN|xjdz c_||jkr)d|j|kDr|xjdz c_yyyy)z"feed a character with known lengthrriN) get_orderr rrr!)r$r)r*orders r%feedzCharDistributionAnalysis.feedXsu q=NN4(EE A:    " t'''11%88$$)$9( r'c<|jdks|j|jkr |jS|j|jk7rD|j|j|jz |jzz }||j kr|S|j S)z(return confidence based on existing datar)r r!MINIMUM_DATA_THRESHOLDSURE_NOrSURE_YES)r$rs r%get_confidencez'CharDistributionAnalysis.get_confidencefs    !T%5%59T9T%T<<     0 0 0  ""T%5%559X9XXA4== }}r'c4|j|jkDSN)r ENOUGH_DATA_THRESHOLDr#s r%got_enough_dataz(CharDistributionAnalysis.got_enough_dataws  4#=#===r'_cy)Nr-)r$r;s r%r.z"CharDistributionAnalysis.get_order|s r'rN)__name__ __module__ __qualname__r9r4r3r2r&r"rbytes bytearrayintr0floatr6boolr:r.r=r'r%rr:s{ HG  *ui/0 *C *D *">> 5 !12sr'rc:eZdZdfd ZdeeefdefdZxZ S)EUCTWDistributionAnalysisrcdt|t|_t|_t |_yr8)superr&r rr rrrr$ __class__s r%r&z"EUCTWDistributionAnalysis.__init__& #; +*J'r'byte_strc:|d}|dk\rd|dz z|dzdz Sy)Nr^rr-r=r$rN first_chars r%r.z#EUCTWDistributionAnalysis.get_order6 a[  d*+hqk9D@ @r'r> r?r@rAr&rrBrCrDr. __classcell__rLs@r%rHrH&K %y(8"9cr'rHc:eZdZdfd ZdeeefdefdZxZ S)EUCKRDistributionAnalysisrcdt|t|_t|_t |_yr8rJr&r rr rr rrKs r%r&z"EUCKRDistributionAnalysis.__init__rMr'rNc:|d}|dk\rd|dz z|dzdz Sy)NrrQrrRr-r=rSs r%r.z#EUCKRDistributionAnalysis.get_orderrUr'r>rVrXs@r%r[r[rYr'r[c:eZdZdfd ZdeeefdefdZxZ S)JOHABDistributionAnalysisrcdt|t|_t|_t |_yr8r]rKs r%r&z"JOHABDistributionAnalysis.__init__rMr'rNcl|d}d|cxkrdkr$ny|dz|dz}tj|dSy)Nrrr-)rget)r$rNrTcodes r%r.z#JOHABDistributionAnalysis.get_ordersHa[ : $ $#hqk1D-11$; ;r'r>rVrXs@r%raras&K %y(8"9cr'rac:eZdZdfd ZdeeefdefdZxZ S)GB2312DistributionAnalysisrcdt|t|_t|_t |_yr8)rJr&rrrrrrrKs r%r&z#GB2312DistributionAnalysis.__init__s& #< ,*K'r'rNcH|d|d}}|dk\r|dk\rd|dz z|zdz Sy)Nrrr_rRrQr-r=r$rNrT second_chars r%r.z$GB2312DistributionAnalysis.get_ordersA #+1+x{K $ [D%8d*+k9D@ @r'r>rVrXs@r%rjrjs&L %y(8"9cr'rjc:eZdZdfd ZdeeefdefdZxZ S)Big5DistributionAnalysisrcdt|t|_t|_t |_yr8)rJr&rrrrrrrKs r%r&z!Big5DistributionAnalysis.__init__s& #: **I'r'rNcj|d|d}}|dk\r$|dk\rd|dz z|zdz dzSd|dz z|zdz Sy) NrrrR?@r-r=rms r%r.z"Big5DistributionAnalysis.get_ordersa #+1+x{K  d"j4/0;>EJJ*t+,{:TA Ar'r>rVrXs@r%rprps&J %y(8"9 c r'rpc:eZdZdfd ZdeeefdefdZxZ S)SJISDistributionAnalysisrcdt|t|_t|_t |_yr8rJr&rrrrrrrKs r%r&z!SJISDistributionAnalysis.__init__& #9 )*H'r'rNc|d|d}}d|cxkrdkr nn d|dz z}nd|cxkrdkrny d|dz dzz}ny ||zd z }|d kDrd }|S) Nrrr-rvr=)r$rNrTrnr/s r%r.z"SJISDistributionAnalysis.get_orders} #+1+x{K : % %:,-E Z '4 ':,r12E #d*  E r'r>rVrXs@r%rxrxs&I %y(8"9cr'rxc:eZdZdfd ZdeeefdefdZxZ S)EUCJPDistributionAnalysisrcdt|t|_t|_t |_yr8rzrKs r%r&z"EUCJPDistributionAnalysis.__init__r{r'rNc:|d}|dk\rd|dz z|dzdz Sy)NrrQrRrr-r=)r$rNr)s r%r.z#EUCJPDistributionAnalysis.get_orders4 { 4<% 3d: :r'r>rVrXs@r%rrs&I %y(8"9cr'rN)!typingrrbig5freqrrr euckrfreqr r r euctwfreqr r r gb2312freqrrrjisfreqrrr johabfreqrrrHr[rarjrprxrr=r'r%rs8      2GGT 8$ 8$  8 !9$7(72 8r'